Preface


Linear algebra has in recent years become an essential part of the mathematical background required by
mathematicians and mathematics teachers, engineers, computer scientists, physicists, economists, and
statisticians, among others. This requirement reflects the importance and wide applications of the subject matter.

This book is designed for use as a textbook for a formal course in linear algebra or as a supplement to all 
current standard texts. It aims to present an introduction to linear algebra which will be found helpful to all 
readers regardless of their fields of specialization. More material has been included than can be covered in most
first courses. This has been done to make the book more flexible, to provide a useful book of reference, and to 
stimulate further interest in the subject. 

Each chapter begins with clear statements of pertinent definitions, principles, and theorems together with 
illustrative and other descriptive material. This is followed by graded sets of solved and supplementary 
problems. The solved problems serve to illustrate and amplify the theory, and to provide the repetition of basic 
principles so vital to effective learning. Numerous proofs, especially those of all essential theorems, are 
included among the solved problems. The supplementary problems serve as a complete review of the material 
of each chapter. 

The first three chapters treat vectors in Euclidean space, matrix algebra, and systems of linear equations. 
These chapters provide the motivation and basic computational tools for the abstract investigations of vector 
spaces and linear mappings which follow. After chapters on inner product spaces and orthogonality and on 
determinants, there is a detailed discussion of eigenvalues and eigenvectors giving conditions for representing a 
linear operator by a diagonal matrix. This naturally leads to the study of various canonical forms, specifically, 
the triangular, Jordan, and rational canonical forms. Later chapters cover linear functions and the dual space V*, 
and bilinear, quadratic, and Hermitian forms. The last chapter treats linear operators on inner product spaces. 

The main changes in the sixth edition are that some parts in Appendix D have been added to the main part of 
the text, that is, Chapter Four and Chapter Eight. There are also many additional solved and supplementary 
problems. 

Finally, we wish to thank the staff of the McGraw-Hill Schaum’s Outline Series, especially Diane Grayson, 
for their unfailing cooperation. 


Seymour Lipschutz 
Marc Lars Lipson 
Vectors in R^n and C^n, Spatial Vectors


1.1 Introduction 


There are two ways to motivate the notion of a vector: one is by means of lists of numbers and subscripts, 
and the other is by means of certain objects in physics. We discuss these two ways below. 

Here we assume the reader is familiar with the elementary properties of the field of real numbers, 
denoted by R. On the other hand, we will review properties of the field of complex numbers, denoted by 
C. In the context of vectors, the elements of our number fields are called scalars. 

Although we will restrict ourselves in this chapter to vectors whose elements come from R and then 
from C, many of our operations also apply to vectors whose entries come from some arbitrary field K. 

Lists of Numbers 

Suppose the weights (in pounds) of eight students are listed as follows: 

156, 125, 145, 134, 178, 145, 162, 193 

One can denote all the values in the list using only one symbol, say w, but with different subscripts; that is,

$w_1, w_2, w_3, w_4, w_5, w_6, w_7, w_8$

Observe that each subscript denotes the position of the value in the list. For example,

$w_1 = 156$, the first number, $w_2 = 125$, the second number, ...

Such a list of values,

$w = (w_1, w_2, w_3, \ldots, w_8)$

is called a linear array or vector.

Vectors in Physics 

Many physical quantities, such as temperature and speed, possess only “magnitude.” These quantities can 
be represented by real numbers and are called scalars. On the other hand, there are also quantities, such as 
force and velocity, that possess both “magnitude” and “direction.” These quantities, which can be 
represented by arrows having appropriate lengths and directions and emanating from some given reference 
point O, are called vectors. 

Now we assume the reader is familiar with the space $R^3$ where all the points in space are represented by
ordered triples of real numbers. Suppose the origin of the axes in $R^3$ is chosen as the reference point O for
the vectors discussed above. Then every vector is uniquely determined by the coordinates of its endpoint,
and vice versa.

There are two important operations, vector addition and scalar multiplication, associated with vectors in 
physics. The definition of these operations and the relationship between these operations and the endpoints 
of the vectors are as follows. 
Figure 1-1: (a) vector addition, where u + v has endpoint (a + a', b + b', c + c'); (b) scalar multiplication

(i) Vector Addition: The resultant u + v of two vectors u and v is obtained by the parallelogram law;
that is, u + v is the diagonal of the parallelogram formed by u and v. Furthermore, if (a, b, c) and
(a', b', c') are the endpoints of the vectors u and v, then (a + a', b + b', c + c') is the endpoint of the
vector u + v. These properties are pictured in Fig. 1-1(a).

(ii) Scalar Multiplication: The product ku of a vector u by a real number k is obtained by multiplying
the magnitude of u by k and retaining the same direction if k > 0 or the opposite direction if k < 0.
Also, if (a, b, c) is the endpoint of the vector u, then (ka, kb, kc) is the endpoint of the vector ku. These
properties are pictured in Fig. 1-1(b).

Mathematically, we identify the vector u with its endpoint (a, b, c) and write u = (a, b, c). Moreover, we call the
ordered triple (a, b, c) of real numbers a point or vector depending upon its interpretation. We generalize
this notion and call an n-tuple $(a_1, a_2, \ldots, a_n)$ of real numbers a vector. However, special notation may be
used for the vectors in $R^3$ called spatial vectors (Section 1.6).


1.2 Vectors in R^n

The set of all n-tuples of real numbers, denoted by $R^n$, is called n-space. A particular n-tuple in $R^n$, say

$u = (a_1, a_2, \ldots, a_n)$

is called a point or vector. The numbers $a_i$ are called the coordinates, components, entries, or elements
of u. Moreover, when discussing the space $R^n$, we use the term scalar for the elements of R.

Two vectors, u and v, are equal, written u = v, if they have the same number of components and if the 
corresponding components are equal. Although the vectors (1,2,3) and (2,3,1) contain the same three 
numbers, these vectors are not equal because corresponding entries are not equal. 

The vector (0,0,..., 0) whose entries are all 0 is called the zero vector and is usually denoted by 0. 

EXAMPLE 1.1 

(a) The following are vectors: 

(2,-5), (7,9), (0,0,0), (3,4,5) 

The first two vectors belong to $R^2$, whereas the last two belong to $R^3$. The third is the zero vector in $R^3$.

(b) Find x, y, z such that (x - y, x + y, z - 1) = (4, 2, 3).

By definition of equality of vectors, corresponding entries must be equal. Thus,

x - y = 4, x + y = 2, z - 1 = 3

Solving the above system of equations yields x = 3, y = -1, z = 4.
Column Vectors

Sometimes a vector in n-space $R^n$ is written vertically rather than horizontally. Such a vector is called a
column vector, and, in this context, the horizontally written vectors in Example 1.1 are called row vectors.
For example, the following are column vectors with 2, 2, 3, and 3 components, respectively:

$\begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad \begin{bmatrix} 3 \\ -4 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 5 \\ -6 \end{bmatrix}, \quad \begin{bmatrix} 1.5 \\ 2/3 \\ -15 \end{bmatrix}$


We also note that any operation defined for row vectors is defined analogously for column vectors. 


1.3 Vector Addition and Scalar Multiplication 

Consider two vectors u and v in $R^n$, say

$u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$

Their sum, written u + v, is the vector obtained by adding corresponding components from u and v. That is,

$u + v = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$

The product of the vector u by a real number k, written ku, is the vector obtained by multiplying each
component of u by k. That is,

$ku = k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n)$

Observe that u + v and ku are also vectors in $R^n$. The sum of vectors with different numbers of
components is not defined.

Negatives and subtraction are defined in $R^n$ as follows:

$-u = (-1)u$ and $u - v = u + (-v)$

The vector $-u$ is called the negative of u, and $u - v$ is called the difference of u and v.

Now suppose we are given vectors $u_1, u_2, \ldots, u_m$ in $R^n$ and scalars $k_1, k_2, \ldots, k_m$ in R. We can multiply
the vectors by the corresponding scalars and then add the resultant scalar products to form the vector

$v = k_1 u_1 + k_2 u_2 + k_3 u_3 + \cdots + k_m u_m$

Such a vector v is called a linear combination of the vectors $u_1, u_2, \ldots, u_m$.
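The componentwise definitions of this section translate directly into code. The following Python sketch is illustrative only: the function names add, scale, and linear_combination are our own (hypothetical) choices, vectors are modeled as tuples, and the sample values are those used in Example 1.2.

```python
def add(u, v):
    """Return u + v by adding corresponding components."""
    if len(u) != len(v):
        raise ValueError("the sum of vectors with different numbers of components is not defined")
    return tuple(a + b for a, b in zip(u, v))


def scale(k, u):
    """Return ku by multiplying each component of u by the scalar k."""
    return tuple(k * a for a in u)


def linear_combination(scalars, vectors):
    """Return k_1 u_1 + k_2 u_2 + ... + k_m u_m (requires m >= 1)."""
    if not vectors or len(scalars) != len(vectors):
        raise ValueError("need at least one vector and exactly one scalar per vector")
    result = scale(scalars[0], vectors[0])
    for k, u in zip(scalars[1:], vectors[1:]):
        result = add(result, scale(k, u))
    return result


u = (2, 4, -5)
v = (1, -6, 9)
print(add(u, v))                            # (3, -2, 4)
print(scale(7, u))                          # (14, 28, -35)
print(linear_combination([3, -5], [u, v]))  # (1, 42, -60)
```

Subtraction needs no separate function in this sketch: u - v is add(u, scale(-1, v)), exactly as in the definition above.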

EXAMPLE 1.2 

(a) Let u = (2, 4, -5) and v = (1, -6, 9). Then

$u + v = (2 + 1, 4 + (-6), -5 + 9) = (3, -2, 4)$

$7u = (7(2), 7(4), 7(-5)) = (14, 28, -35)$

$-v = (-1)(1, -6, 9) = (-1, 6, -9)$

$3u - 5v = (6, 12, -15) + (-5, 30, -45) = (1, 42, -60)$

(b) The zero vector 0 = (0, 0, ..., 0) in $R^n$ is similar to the scalar 0 in that, for any vector $u = (a_1, a_2, \ldots, a_n)$,

$u + 0 = (a_1 + 0, a_2 + 0, \ldots, a_n + 0) = (a_1, a_2, \ldots, a_n) = u$

(c) Let $u = \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Then $2u - 3v = \begin{bmatrix} 4 \\ 6 \\ -8 \end{bmatrix} + \begin{bmatrix} -9 \\ 3 \\ 6 \end{bmatrix} = \begin{bmatrix} -5 \\ 9 \\ -2 \end{bmatrix}$
Basic properties of vectors under the operations of vector addition and scalar multiplication are
described in the following theorem.

THEOREM 1.1: For any vectors u, v, w in $R^n$ and any scalars k, k' in R,

(i) (u + v) + w = u + (v + w), (v) k(u + v) = ku + kv,

(ii) u + 0 = u, (vi) (k + k')u = ku + k'u,

(iii) u + (-u) = 0, (vii) (kk')u = k(k'u),

(iv) u + v = v + u, (viii) 1u = u.

We postpone the proof of Theorem 1.1 until Chapter 2, where it appears in the context of matrices
(Problem 2.3).

Suppose u and v are vectors in $R^n$ for which u = kv for some nonzero scalar k in R. Then u is called a
multiple of v. Also, u is said to be in the same or opposite direction as v according to whether k > 0 or
k < 0.


1.4 Dot (Inner) Product 

Consider arbitrary vectors u and v in $R^n$; say,

$u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$

The dot product or inner product of u and v is denoted and defined by

$u \cdot v = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$

That is, $u \cdot v$ is obtained by multiplying corresponding components and adding the resulting products.
The vectors u and v are said to be orthogonal (or perpendicular) if their dot product is zero; that is, if
$u \cdot v = 0$.


EXAMPLE 1.3 

(a) Let u = (1, -2, 3), v = (4, 5, -1), w = (2, 7, 4). Then,

$u \cdot v = 1(4) - 2(5) + 3(-1) = 4 - 10 - 3 = -9$

$u \cdot w = 2 - 14 + 12 = 0$, $v \cdot w = 8 + 35 - 4 = 39$

Thus, u and w are orthogonal.

(b) Let $u = \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Then $u \cdot v = 6 - 3 + 8 = 11$.

(c) Suppose u = (1, 2, 3, 4) and v = (6, k, -8, 2). Find k so that u and v are orthogonal.

First obtain $u \cdot v = 6 + 2k - 24 + 8 = -10 + 2k$. Then set $u \cdot v = 0$ and solve for k:

-10 + 2k = 0 or 2k = 10 or k = 5
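The computations of Example 1.3 can be checked with a few lines of Python. This is a sketch under our own naming (dot, orthogonal, and f are hypothetical helpers, not notation from the text). For part (c) it uses the fact that the dot product is a linear function of k, so that f(k) = f(0) + (f(1) - f(0))k.

```python
from fractions import Fraction


def dot(u, v):
    """Multiply corresponding components and add the resulting products."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


def orthogonal(u, v):
    """Exact test, suitable for integer or Fraction entries (floats would need a tolerance)."""
    return dot(u, v) == 0


u, v, w = (1, -2, 3), (4, 5, -1), (2, 7, 4)
print(dot(u, v), dot(u, w), dot(v, w))  # -9 0 39
print(orthogonal(u, w))                 # True


def f(k):
    """Dot product of (1, 2, 3, 4) and (6, k, -8, 2) as a function of k."""
    return dot((1, 2, 3, 4), (6, k, -8, 2))


slope = f(1) - f(0)         # coefficient of k in the linear function f
k = Fraction(-f(0), slope)  # solve f(0) + slope * k = 0
print(k)                    # 5
```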

Basic properties of the dot product in $R^n$ (proved in Problem 1.13) follow.

THEOREM 1.2: For any vectors u, v, w in $R^n$ and any scalar k in R:

(i) $(u + v) \cdot w = u \cdot w + v \cdot w$, (iii) $u \cdot v = v \cdot u$,

(ii) $(ku) \cdot v = k(u \cdot v)$, (iv) $u \cdot u \ge 0$, and $u \cdot u = 0$ iff u = 0.

Note that (ii) says that we can “take k out” from the first position in an inner product. By (iii) and (ii),

$u \cdot (kv) = (kv) \cdot u = k(v \cdot u) = k(u \cdot v)$
That is, we can also “take k out” from the second position in an inner product.

The space $R^n$ with the above operations of vector addition, scalar multiplication, and dot product is
usually called Euclidean n-space.

Norm (Length) of a Vector 

The norm or length of a vector u in $R^n$, denoted by $\|u\|$, is defined to be the nonnegative square root of
$u \cdot u$. In particular, if $u = (a_1, a_2, \ldots, a_n)$, then

$\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$

That is, $\|u\|$ is the square root of the sum of the squares of the components of u. Thus, $\|u\| \ge 0$, and
$\|u\| = 0$ if and only if u = 0.

A vector u is called a unit vector if $\|u\| = 1$ or, equivalently, if $u \cdot u = 1$. For any nonzero vector v in
$R^n$, the vector

$\hat{v} = \frac{1}{\|v\|} v = \frac{v}{\|v\|}$

is the unique unit vector in the same direction as v. The process of finding $\hat{v}$ from v is called normalizing v.
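As a small illustration of the norm and of normalizing, here is a Python sketch. The names norm and normalize are our own, and the vector (3, 4) is an illustrative value chosen for its round length; it does not come from the text.

```python
import math


def norm(u):
    """Nonnegative square root of the sum of the squares of the components."""
    return math.sqrt(sum(a * a for a in u))


def normalize(v):
    """Return the unit vector v / ||v||; undefined for the zero vector."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(a / length for a in v)


v = (3, 4)
v_hat = normalize(v)
print(norm(v))  # 5.0
print(v_hat)    # (0.6, 0.8)
```

Because the result is computed in floating point, a check such as "is this a unit vector?" is best done with a tolerance, for example math.isclose(norm(v_hat), 1.0).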

EXAMPLE 1.4 

(a) Suppose u = (1, -2, -4, 5, 3). To find $\|u\|$, we can first find $\|u\|^2 = u \cdot u$ by squaring each component of u and
adding, as follows:

$\|u\|^2 = 1^2 + (-2)^2 + (-4)^2 + 5^2 + 3^2 = 1 + 4 + 16 + 25 + 9 = 55$

Then $\|u\| = \sqrt{55}$.

(b) Let v = (1, -3, 4, 2) and w = (1/2, -1/6, 5/6, 1/6). Then

$\|v\| = \sqrt{1 + 9 + 16 + 4} = \sqrt{30}$ and $\|w\| = \sqrt{\frac{9}{36} + \frac{1}{36} + \frac{25}{36} + \frac{1}{36}} = \sqrt{\frac{36}{36}} = \sqrt{1} = 1$

Thus w is a unit vector, but v is not a unit vector. However, we can normalize v as follows:

$\hat{v} = \frac{v}{\|v\|} = \left( \frac{1}{\sqrt{30}}, \frac{-3}{\sqrt{30}}, \frac{4}{\sqrt{30}}, \frac{2}{\sqrt{30}} \right)$

This is the unique unit vector in the same direction as v.

The following formula (proved in Problem 1.14) is known as the Schwarz inequality or Cauchy-Schwarz
inequality. It is used in many branches of mathematics.

THEOREM 1.3 (Schwarz): For any vectors u, v in $R^n$, $|u \cdot v| \le \|u\| \|v\|$.

Using the above inequality, we also prove (Problem 1.15) the following result known as the “triangle
inequality” or Minkowski’s inequality.

THEOREM 1.4 (Minkowski): For any vectors u, v in $R^n$, $\|u + v\| \le \|u\| + \|v\|$.

Distance, Angles, Projections

The distance between vectors $u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$ in $R^n$ is denoted and defined
by

$d(u, v) = \|u - v\| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}$

One can show that this definition agrees with the usual notion of distance in the Euclidean plane $R^2$ or
space $R^3$.
The angle $\theta$ between nonzero vectors u, v in $R^n$ is defined by

$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|}$

This angle is well defined, because, by the Schwarz inequality (Theorem 1.3),

$-1 \le \frac{u \cdot v}{\|u\| \|v\|} \le 1$

Note that if $u \cdot v = 0$, then $\theta = 90°$ (or $\theta = \pi/2$). This then agrees with our previous definition of
orthogonality.

The projection of a vector u onto a nonzero vector v is the vector denoted and defined by

$\mathrm{proj}(u, v) = \frac{u \cdot v}{\|v\|^2} v = \frac{u \cdot v}{v \cdot v} v$

We show below that this agrees with the usual notion of vector projection in physics. 


EXAMPLE 1.5 

(a) Suppose u = (1, -2, 3) and v = (2, 4, 5). Then

$d(u, v) = \sqrt{(1 - 2)^2 + (-2 - 4)^2 + (3 - 5)^2} = \sqrt{1 + 36 + 4} = \sqrt{41}$

To find $\cos\theta$, where $\theta$ is the angle between u and v, we first find

$u \cdot v = 2 - 8 + 15 = 9$, $\|u\|^2 = 1 + 4 + 9 = 14$, $\|v\|^2 = 4 + 16 + 25 = 45$

Then

$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|} = \frac{9}{\sqrt{14}\sqrt{45}}$

Also,

$\mathrm{proj}(u, v) = \frac{u \cdot v}{\|v\|^2} v = \frac{9}{45}(2, 4, 5) = \frac{1}{5}(2, 4, 5) = \left( \frac{2}{5}, \frac{4}{5}, 1 \right)$


(b) Consider the vectors u and v in Fig. 1-2(a) (with respective endpoints A and B). The (perpendicular) projection of
u onto v is the vector u* with magnitude

$\|u^*\| = \|u\| \cos\theta = \|u\| \frac{u \cdot v}{\|u\| \|v\|} = \frac{u \cdot v}{\|v\|}$

To obtain u*, we multiply its magnitude by the unit vector in the direction of v, obtaining

$u^* = \|u^*\| \frac{v}{\|v\|} = \frac{u \cdot v}{\|v\|} \frac{v}{\|v\|} = \frac{u \cdot v}{\|v\|^2} v$

This is the same as the above definition of proj(u, v).
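The quantities of Example 1.5(a) can be reproduced with the following Python sketch. The helper names (dot, distance, cos_angle, proj) are our own. The projection uses fractions.Fraction so that the result is exact; this is a design choice of the sketch and assumes integer (or rational) components.

```python
import math
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def distance(u, v):
    """d(u, v) = ||u - v||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cos_angle(u, v):
    """cos(theta) = (u . v) / (||u|| ||v||) for nonzero u and v."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def proj(u, v):
    """proj(u, v) = ((u . v) / (v . v)) v for nonzero v, computed exactly."""
    coefficient = Fraction(dot(u, v), dot(v, v))
    return tuple(coefficient * b for b in v)


u, v = (1, -2, 3), (2, 4, 5)
p = proj(u, v)
residual = tuple(a - b for a, b in zip(u, p))  # u - proj(u, v)

print([str(c) for c in p])  # ['2/5', '4/5', '1']
print(dot(residual, v))     # 0
```

The last line checks the geometric meaning of a perpendicular projection: what remains of u after removing proj(u, v) is orthogonal to v.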


Figure 1-2: (a) projection u* of u onto v; (b) the located vector from A to B and the vector u = B - A
1.5 Located Vectors, Hyperplanes, Lines, Curves in R^n

This section distinguishes between an n-tuple $P(a_i) = P(a_1, a_2, \ldots, a_n)$ viewed as a point in $R^n$ and an
n-tuple $u = [c_1, c_2, \ldots, c_n]$ viewed as a vector (arrow) from the origin O to the point $C(c_1, c_2, \ldots, c_n)$.

Located Vectors 

Any pair of points $A(a_i)$ and $B(b_i)$ in $R^n$ defines the located vector or directed line segment from A to B,
written $\overrightarrow{AB}$. We identify $\overrightarrow{AB}$ with the vector

$u = B - A = [b_1 - a_1, b_2 - a_2, \ldots, b_n - a_n]$

because $\overrightarrow{AB}$ and u have the same magnitude and direction. This is pictured in Fig. 1-2(b) for the
points $A(a_1, a_2, a_3)$ and $B(b_1, b_2, b_3)$ in $R^3$ and the vector u = B - A which has the endpoint
$P(b_1 - a_1, b_2 - a_2, b_3 - a_3)$.

Hyperplanes 

A hyperplane H in $R^n$ is the set of points $(x_1, x_2, \ldots, x_n)$ that satisfy a linear equation

$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$

where the vector $u = [a_1, a_2, \ldots, a_n]$ of coefficients is not zero. Thus a hyperplane H in $R^2$ is a line, and a
hyperplane H in $R^3$ is a plane. We show below, as pictured in Fig. 1-3(a) for $R^3$, that u is orthogonal to
any directed line segment $\overrightarrow{PQ}$, where $P(p_i)$ and $Q(q_i)$ are points in H. [For this reason, we say that u is
normal to H and that H is normal to u.]




Figure 1-3 

Because $P(p_i)$ and $Q(q_i)$ belong to H, they satisfy the above hyperplane equation; that is,

$a_1 p_1 + a_2 p_2 + \cdots + a_n p_n = b$ and $a_1 q_1 + a_2 q_2 + \cdots + a_n q_n = b$

Let $v = \overrightarrow{PQ} = Q - P = [q_1 - p_1, q_2 - p_2, \ldots, q_n - p_n]$

Then

$u \cdot v = a_1(q_1 - p_1) + a_2(q_2 - p_2) + \cdots + a_n(q_n - p_n)$

$= (a_1 q_1 + a_2 q_2 + \cdots + a_n q_n) - (a_1 p_1 + a_2 p_2 + \cdots + a_n p_n) = b - b = 0$


Thus $v = \overrightarrow{PQ}$ is orthogonal to u, as claimed.
Lines in R^n

The line L in $R^n$ passing through the point $P(b_1, b_2, \ldots, b_n)$ and in the direction of a nonzero vector
$u = [a_1, a_2, \ldots, a_n]$ consists of the points $X(x_1, x_2, \ldots, x_n)$ that satisfy

$\begin{cases} x_1 = a_1 t + b_1 \\ x_2 = a_2 t + b_2 \\ \cdots \\ x_n = a_n t + b_n \end{cases}$ or $L(t) = (a_i t + b_i)$

where the parameter t takes on all real values. Such a line L in $R^3$ is pictured in Fig. 1-3(b).

EXAMPLE 1.6 

(a) Let H be the plane in $R^3$ corresponding to the linear equation 2x - 5y + 7z = 4. Observe that P(1, 1, 1) and
Q(5, 4, 2) are solutions of the equation. Thus P and Q and the directed line segment

$v = \overrightarrow{PQ} = Q - P = [5 - 1, 4 - 1, 2 - 1] = [4, 3, 1]$

lie on the plane H. The vector u = [2, -5, 7] is normal to H, and, as expected,

$u \cdot v = [2, -5, 7] \cdot [4, 3, 1] = 8 - 15 + 7 = 0$

That is, u is orthogonal to v.

(b) Find an equation of the hyperplane H in $R^4$ that passes through the point P(1, 3, -4, 2) and is normal to the
vector u = [4, -2, 5, 6].

The coefficients of the unknowns of an equation of H are the components of the normal vector u; hence, the
equation of H must be of the form

$4x_1 - 2x_2 + 5x_3 + 6x_4 = k$

Substituting P into this equation, we obtain

4(1) - 2(3) + 5(-4) + 6(2) = k or 4 - 6 - 20 + 12 = k or k = -10

Thus, $4x_1 - 2x_2 + 5x_3 + 6x_4 = -10$ is the equation of H.

(c) Find the parametric representation of the line L in $R^4$ passing through the point P(1, 2, 3, -4) and in the direction
of u = [5, 6, -7, 8]. Also, find the point Q on L when t = 1.

Substitution in the above equation for L yields the following parametric representation:

$x_1 = 5t + 1$, $x_2 = 6t + 2$, $x_3 = -7t + 3$, $x_4 = 8t - 4$

or, equivalently,

$L(t) = (5t + 1, 6t + 2, -7t + 3, 8t - 4)$

Note that t = 0 yields the point P on L. Substitution of t = 1 yields the point Q(6, 8, -4, 4) on L.
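The three parts of Example 1.6 amount to a dot product and a componentwise substitution, as the following Python sketch shows. The function names hyperplane_constant and point_on_line are hypothetical helpers introduced only for this illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hyperplane_constant(normal, point):
    """Return b so that a_1 x_1 + ... + a_n x_n = b has the given normal and contains the point."""
    return dot(normal, point)


def point_on_line(point, direction, t):
    """Return L(t) = (a_i t + b_i) for the line through `point` in the given direction."""
    return tuple(a * t + b for a, b in zip(direction, point))


# Part (a): the normal u = [2, -5, 7] is orthogonal to the segment from P to Q.
normal, P, Q = (2, -5, 7), (1, 1, 1), (5, 4, 2)
PQ = tuple(q - p for p, q in zip(P, Q))
print(PQ, dot(normal, PQ))  # (4, 3, 1) 0

# Part (b): right-hand side of the hyperplane equation.
print(hyperplane_constant((4, -2, 5, 6), (1, 3, -4, 2)))  # -10

# Part (c): the point on the line for t = 1.
print(point_on_line((1, 2, 3, -4), (5, 6, -7, 8), 1))  # (6, 8, -4, 4)
```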


Curves in R^n

Let D be an interval (finite or infinite) on the real line R. A continuous function $F: D \to R^n$ is a curve in
$R^n$. Thus, to each point $t \in D$ there is assigned the following point in $R^n$:

$F(t) = [F_1(t), F_2(t), \ldots, F_n(t)]$

Moreover, the derivative (if it exists) of F(t) yields the vector

$V(t) = \frac{dF(t)}{dt} = \left[ \frac{dF_1(t)}{dt}, \frac{dF_2(t)}{dt}, \ldots, \frac{dF_n(t)}{dt} \right]$

which is tangent to the curve. Normalizing V(t) yields

$\mathbf{T}(t) = \frac{V(t)}{\|V(t)\|}$

Thus, $\mathbf{T}(t)$ is the unit tangent vector to the curve. (Unit vectors with geometrical significance are often
presented in bold type.)


EXAMPLE 1.7 Consider the curve $F(t) = [\sin t, \cos t, t]$ in $R^3$. Taking the derivative of F(t) [or each component of
F(t)] yields

$V(t) = [\cos t, -\sin t, 1]$

which is a vector tangent to the curve. We normalize V(t). First we obtain

$\|V(t)\|^2 = \cos^2 t + \sin^2 t + 1 = 1 + 1 = 2$

Then the unit tangent vector $\mathbf{T}(t)$ to the curve follows:

$\mathbf{T}(t) = \frac{V(t)}{\|V(t)\|} = \left[ \frac{\cos t}{\sqrt{2}}, \frac{-\sin t}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right]$
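A numerical sanity check of Example 1.7 can be sketched in Python as follows. The central-difference helper, the step size h, and the sample parameter t = 0.7 are all assumptions made for this illustration; the text itself works with the exact derivative.

```python
import math


def F(t):
    """The curve F(t) = [sin t, cos t, t]."""
    return (math.sin(t), math.cos(t), t)


def V(t):
    """Componentwise derivative of F."""
    return (math.cos(t), -math.sin(t), 1.0)


def unit_tangent(t):
    """T(t) = V(t) / ||V(t)||."""
    v = V(t)
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def central_difference(curve, t, h=1e-6):
    """Approximate the derivative of a curve componentwise."""
    return tuple((a - b) / (2 * h) for a, b in zip(curve(t + h), curve(t - h)))


t = 0.7  # arbitrary sample parameter
approx = central_difference(F, t)
print(all(math.isclose(a, b, abs_tol=1e-6) for a, b in zip(approx, V(t))))  # True
print(math.isclose(sum(c * c for c in unit_tangent(t)), 1.0))               # True
```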


1.6 Vectors in R^3 (Spatial Vectors), ijk Notation

Vectors in $R^3$, called spatial vectors, appear in many applications, especially in physics. In fact, a special
notation is frequently used for such vectors as follows:

i = [1, 0, 0] denotes the unit vector in the x direction,

j = [0, 1, 0] denotes the unit vector in the y direction,

k = [0, 0, 1] denotes the unit vector in the z direction.

Then any vector u = [a, b, c] in $R^3$ can be expressed uniquely in the form

$u = [a, b, c] = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$

Because the vectors i, j, k are unit vectors and are mutually orthogonal, we obtain the following dot
products:

$\mathbf{i} \cdot \mathbf{i} = 1$, $\mathbf{j} \cdot \mathbf{j} = 1$, $\mathbf{k} \cdot \mathbf{k} = 1$ and $\mathbf{i} \cdot \mathbf{j} = 0$, $\mathbf{i} \cdot \mathbf{k} = 0$, $\mathbf{j} \cdot \mathbf{k} = 0$

Furthermore, the vector operations discussed above may be expressed in the ijk notation as follows.
Suppose

$u = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $v = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$

Then

$u + v = (a_1 + b_1)\mathbf{i} + (a_2 + b_2)\mathbf{j} + (a_3 + b_3)\mathbf{k}$ and $cu = ca_1\mathbf{i} + ca_2\mathbf{j} + ca_3\mathbf{k}$

where c is a scalar. Also,

$u \cdot v = a_1 b_1 + a_2 b_2 + a_3 b_3$ and $\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + a_3^2}$
EXAMPLE 1.8 Suppose u = 3i + 5j - 2k and v = 4i - 8j + 7k.

(a) To find u + v, add corresponding components, obtaining u + v = 7i - 3j + 5k

(b) To find 3u - 2v, first multiply by the scalars and then add:

3u - 2v = (9i + 15j - 6k) + (-8i + 16j - 14k) = i + 31j - 20k

(c) To find $u \cdot v$, multiply corresponding components and then add:

$u \cdot v = 12 - 40 - 14 = -42$

(d) To find $\|u\|$, take the square root of the sum of the squares of the components:

$\|u\| = \sqrt{9 + 25 + 4} = \sqrt{38}$


Cross Product 

There is a special operation for vectors u and v in $R^3$ that is not defined in $R^n$ for $n \ne 3$. This operation is
called the cross product and is denoted by $u \times v$. One way to easily remember the formula for $u \times v$ is to
use the determinant (of order two) and its negative, which are denoted and defined as follows:

$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$ and $-\begin{vmatrix} a & b \\ c & d \end{vmatrix} = bc - ad$

Here a and d are called the diagonal elements and b and c are the nondiagonal elements. Thus, the
determinant is the product ad of the diagonal elements minus the product bc of the nondiagonal elements,
but vice versa for the negative of the determinant.

Now suppose $u = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $v = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$. Then

$u \times v = (a_2 b_3 - a_3 b_2)\mathbf{i} + (a_3 b_1 - a_1 b_3)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}$

$= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \mathbf{k}$

That is, the three components of $u \times v$ are obtained from the array

$\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}$

(which contains the components of u above the components of v) as follows:


(1) Cover the first column and take the determinant. 

(2) Cover the second column and take the negative of the determinant. 

(3) Cover the third column and take the determinant. 


Note that u x v is a vector; hence, u x v is also called the vector product or outer product of u 
and v. 


EXAMPLE 1.9 Find $u \times v$ where: (a) u = 4i + 3j + 6k, v = 2i + 5j - 3k, (b) u = [2, -1, 5], v = [3, 7, 6].

(a) Use $\begin{bmatrix} 4 & 3 & 6 \\ 2 & 5 & -3 \end{bmatrix}$ to get $u \times v$ = (-9 - 30)i + (12 + 12)j + (20 - 6)k = -39i + 24j + 14k

(b) Use $\begin{bmatrix} 2 & -1 & 5 \\ 3 & 7 & 6 \end{bmatrix}$ to get $u \times v$ = [-6 - 35, 15 - 12, 14 + 3] = [-41, 3, 17]


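Remark: The cross products of the vectors i, j, k are as follows:

$\mathbf{i} \times \mathbf{j} = \mathbf{k}$, $\mathbf{j} \times \mathbf{k} = \mathbf{i}$, $\mathbf{k} \times \mathbf{i} = \mathbf{j}$

$\mathbf{j} \times \mathbf{i} = -\mathbf{k}$, $\mathbf{k} \times \mathbf{j} = -\mathbf{i}$, $\mathbf{i} \times \mathbf{k} = -\mathbf{j}$

Thus, if we view the triple (i, j, k) as a cyclic permutation, where i follows k and hence k precedes i, then
the product of two of them in the given direction is the third one, but the product of two of them in the
opposite direction is the negative of the third one.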
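The cross product formula and the cyclic rule for i, j, k can be checked with a short Python sketch. The function name cross is our own, and vectors are modeled as 3-tuples; the sample vectors are those of Example 1.9.

```python
def cross(u, v):
    """Cross product in R^3, written out from the component formula."""
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (a2 * b3 - a3 * b2,   # cover the first column, take the determinant
            a3 * b1 - a1 * b3,   # cover the second column, take the negative
            a1 * b2 - a2 * b1)   # cover the third column, take the determinant


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(cross((4, 3, 6), (2, 5, -3)))  # (-39, 24, 14)
print(cross((2, -1, 5), (3, 7, 6)))  # (-41, 3, 17)
print(cross(i, j) == k, cross(j, k) == i, cross(k, i) == j)  # True True True
print(cross(j, i))                   # (0, 0, -1)
```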


Two important properties of the cross product are contained in the following theorem. 
Figure 1-4: (a) parallelepiped formed by u, v, w, labeled Volume = u · v × w; (b) the complex plane


THEOREM 1.5: Let u, v, w be vectors in $R^3$.

(a) The vector $u \times v$ is orthogonal to both u and v.

(b) The absolute value of the “triple product”

$u \cdot v \times w$

represents the volume of the parallelepiped formed by the vectors u, v, w.
[See Fig. 1-4(a).]

We note that the vectors u, v, $u \times v$ form a right-handed system, and that the following formula gives
the magnitude of $u \times v$:

$\|u \times v\| = \|u\| \|v\| \sin\theta$

where $\theta$ is the angle between u and v.
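Both parts of Theorem 1.5 and the magnitude formula can be spot-checked numerically. In the Python sketch below the helper names and the box-shaped test vectors are our own assumptions. The magnitude formula is checked in squared form: squaring both sides and using $\sin^2\theta = 1 - \cos^2\theta$ with $\cos\theta = (u \cdot v)/(\|u\|\|v\|)$ gives $\|u \times v\|^2 = \|u\|^2\|v\|^2 - (u \cdot v)^2$, which involves only integers here.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def triple(u, v, w):
    """The triple product u . (v x w)."""
    return dot(u, cross(v, w))


u, v = (4, 3, 6), (2, 5, -3)
n = cross(u, v)

# Part (a): u x v is orthogonal to both u and v.
print(dot(n, u), dot(n, v))  # 0 0

# Squared form of ||u x v|| = ||u|| ||v|| sin(theta).
lhs = dot(n, n)
rhs = dot(u, u) * dot(v, v) - dot(u, v) ** 2
print(lhs, rhs)  # 2293 2293

# Part (b): a 1 x 2 x 3 box (illustrative vectors) has volume 6.
print(abs(triple((1, 0, 0), (0, 2, 0), (0, 0, 3))))  # 6
```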


1.7 Complex Numbers 

The set of complex numbers is denoted by C. Formally, a complex number is an ordered pair (a, b) of real 
numbers where equality, addition, and multiplication are defined as follows: 

(a, b) = (c, d) if and only if a = c and b = d

(a, b) + (c, d) = (a + c, b + d)

$(a, b) \cdot (c, d) = (ac - bd, ad + bc)$

We identify the real number a with the complex number (a, 0); that is,

$a \leftrightarrow (a, 0)$

This is possible because the operations of addition and multiplication of real numbers are preserved under
the correspondence; that is,

(a, 0) + (b, 0) = (a + b, 0) and $(a, 0) \cdot (b, 0) = (ab, 0)$

Thus we view R as a subset of C, and replace (a, 0) by a whenever convenient and possible.

We note that the set C of complex numbers with the above operations of addition and multiplication is a 
field of numbers, like the set R of real numbers and the set Q of rational numbers. 
The complex number (0, 1) is denoted by i. It has the important property that

$i^2 = ii = (0, 1)(0, 1) = (-1, 0) = -1$ or $i = \sqrt{-1}$

Accordingly, any complex number z = (a, b) can be written in the form

$z = (a, b) = (a, 0) + (0, b) = (a, 0) + (b, 0) \cdot (0, 1) = a + bi$

The above notation z = a + bi, where a = Re z and b = Im z are called, respectively, the real and
imaginary parts of z, is more convenient than (a, b). In fact, the sum and product of complex numbers
z = a + bi and w = c + di can be derived by simply using the commutative and distributive laws and
$i^2 = -1$:

$z + w = (a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i$

$zw = (a + bi)(c + di) = ac + bci + adi + bdi^2 = (ac - bd) + (bc + ad)i$

We also define the negative of z and subtraction in C by

$-z = -1z$ and $w - z = w + (-z)$

Warning: The letter i representing $\sqrt{-1}$ has no relationship whatsoever to the vector i = [1, 0, 0] in
Section 1.6.
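The formal ordered-pair definition can be mirrored directly in Python and compared with the language's built-in complex type. The names c_add, c_mul, and as_builtin are hypothetical helpers for this sketch, and the sample values z = 2 + 3i and w = 5 - 2i are illustrative.

```python
def c_add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    return (z[0] + w[0], z[1] + w[1])


def c_mul(z, w):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def as_builtin(z):
    """Convert an ordered pair (a, b) to Python's complex number a + bj."""
    return complex(z[0], z[1])


i = (0, 1)
z, w = (2, 3), (5, -2)

print(c_mul(i, i))  # (-1, 0)
print(c_add(z, w))  # (7, 1)
print(c_mul(z, w))  # (16, 11)
print(as_builtin(c_mul(z, w)) == as_builtin(z) * as_builtin(w))  # True
```

Note that Python writes the imaginary unit as j rather than i.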

Complex Conjugate, Absolute Value 

Consider a complex number z = a + bi. The conjugate of z is denoted and defined by

$\bar{z} = \overline{a + bi} = a - bi$

Then $z\bar{z} = (a + bi)(a - bi) = a^2 - b^2 i^2 = a^2 + b^2$. Note that z is real if and only if $\bar{z} = z$.

The absolute value of z, denoted by |z|, is defined to be the nonnegative square root of $z\bar{z}$. Namely,

$|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$

Note that |z| is equal to the norm of the vector (a, b) in $R^2$.

Suppose $z \ne 0$. Then the inverse $z^{-1}$ of z and division in C of w by z are given, respectively, by

$z^{-1} = \frac{\bar{z}}{z\bar{z}} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2} i$ and $\frac{w}{z} = \frac{w\bar{z}}{z\bar{z}} = wz^{-1}$

EXAMPLE 1.10 Suppose z = 2 + 3i and w = 5 - 2i. Then

$z + w = (2 + 3i) + (5 - 2i) = 2 + 5 + 3i - 2i = 7 + i$

$zw = (2 + 3i)(5 - 2i) = 10 + 15i - 4i - 6i^2 = 16 + 11i$

$\bar{z} = \overline{2 + 3i} = 2 - 3i$ and $\bar{w} = \overline{5 - 2i} = 5 + 2i$

$\frac{w}{z} = \frac{5 - 2i}{2 + 3i} = \frac{(5 - 2i)(2 - 3i)}{(2 + 3i)(2 - 3i)} = \frac{4 - 19i}{13} = \frac{4}{13} - \frac{19}{13} i$

$|z| = \sqrt{4 + 9} = \sqrt{13}$ and $|w| = \sqrt{25 + 4} = \sqrt{29}$

Complex Plane

Recall that the real numbers R can be represented by points on a line. Analogously, the complex numbers
C can be represented by points in the plane. Specifically, we let the point (a, b) in the plane represent the
complex number a + bi as shown in Fig. 1-4(b). In such a case, |z| is the distance from the origin O to the
point z. The plane with this representation is called the complex plane, just like the line representing R is
called the real line.
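The conjugate, absolute value, inverse, and quotient of this section can be sketched in Python on ordered pairs (a, b). The helper names are our own, and fractions.Fraction is used (an assumption of this sketch, suitable for integer or rational parts) so that the quotient of Example 1.10 comes out exactly.

```python
import math
from fractions import Fraction


def c_mul(z, w):
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def conjugate(z):
    """Conjugate of a + bi is a - bi."""
    a, b = z
    return (a, -b)


def modulus(z):
    """|z| = sqrt(a^2 + b^2), the norm of (a, b) in R^2."""
    a, b = z
    return math.sqrt(a * a + b * b)


def inverse(z):
    """z^(-1) = conjugate(z) / (z * conjugate(z)) for z != 0."""
    a, b = z
    d = a * a + b * b
    if d == 0:
        raise ZeroDivisionError("0 has no inverse")
    return (Fraction(a, d), Fraction(-b, d))


def divide(w, z):
    """w / z = w * z^(-1)."""
    return c_mul(w, inverse(z))


z, w = (2, 3), (5, -2)
print(conjugate(z), conjugate(w))                 # (2, -3) (5, 2)
print(c_mul(z, conjugate(z)))                     # (13, 0)
print(tuple(str(part) for part in divide(w, z)))  # ('4/13', '-19/13')
```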
1.8 Vectors in C^n

The set of all n-tuples of complex numbers, denoted by $C^n$, is called complex n-space. Just as in the real
case, the elements of $C^n$ are called points or vectors, the elements of C are called scalars, and vector
addition in $C^n$ and scalar multiplication on $C^n$ are given by

$[z_1, z_2, \ldots, z_n] + [w_1, w_2, \ldots, w_n] = [z_1 + w_1, z_2 + w_2, \ldots, z_n + w_n]$

$z[z_1, z_2, \ldots, z_n] = [zz_1, zz_2, \ldots, zz_n]$

where the $z_i$, $w_i$, and z belong to C.

EXAMPLE 1.11 Consider vectors u = [2 + 3i, 4 - i, 3] and v = [3 - 2i, 5i, 4 - 6i] in $C^3$. Then

u + v = [2 + 3i, 4 - i, 3] + [3 - 2i, 5i, 4 - 6i] = [5 + i, 4 + 4i, 7 - 6i]

(5 - 2i)u = [(5 - 2i)(2 + 3i), (5 - 2i)(4 - i), (5 - 2i)(3)] = [16 + 11i, 18 - 13i, 15 - 6i]

Dot (Inner) Product in C^n

Consider vectors $u = [z_1, z_2, \ldots, z_n]$ and $v = [w_1, w_2, \ldots, w_n]$ in $C^n$. The dot or inner product of u and v is
denoted and defined by

$u \cdot v = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n$

This definition reduces to the real case because $\bar{w}_i = w_i$ when $w_i$ is real. The norm of u is defined by

$\|u\| = \sqrt{u \cdot u} = \sqrt{z_1\bar{z}_1 + z_2\bar{z}_2 + \cdots + z_n\bar{z}_n} = \sqrt{|z_1|^2 + |z_2|^2 + \cdots + |z_n|^2}$

We emphasize that $u \cdot u$ and so $\|u\|$ are real and positive when $u \ne 0$ and 0 when u = 0.

EXAMPLE 1.12 Consider vectors u = [2 + 3i, 4 - i, 3 + 5i] and v = [3 - 4i, 5i, 4 - 2i] in $C^3$. Then

$u \cdot v = (2 + 3i)\overline{(3 - 4i)} + (4 - i)\overline{(5i)} + (3 + 5i)\overline{(4 - 2i)}$

$= (2 + 3i)(3 + 4i) + (4 - i)(-5i) + (3 + 5i)(4 + 2i)$

$= (-6 + 17i) + (-5 - 20i) + (2 + 26i) = -9 + 23i$

$u \cdot u = |2 + 3i|^2 + |4 - i|^2 + |3 + 5i|^2 = 4 + 9 + 16 + 1 + 9 + 25 = 64$

$\|u\| = \sqrt{64} = 8$

The space $C^n$ with the above operations of vector addition, scalar multiplication, and dot product, is
called complex Euclidean n-space. Theorem 1.2 for $R^n$ also holds for $C^n$ if we replace $u \cdot v = v \cdot u$ by

$u \cdot v = \overline{v \cdot u}$

On the other hand, the Schwarz inequality (Theorem 1.3) and Minkowski’s inequality (Theorem 1.4) are
true for $C^n$ with no changes.
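Python's built-in complex type makes the dot product in complex n-space easy to try out. In the sketch below, cdot and cnorm are our own (hypothetical) names, the imaginary unit is written j as Python requires, and the vectors are those of Example 1.12. The conjugate is applied to the entries of the second vector, following the definition above.

```python
import math


def cdot(u, v):
    """Dot product in C^n: sum of z_k times the conjugate of w_k."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(z * w.conjugate() for z, w in zip(u, v))


def cnorm(u):
    """Norm in C^n; u . u is real, so only its real part is used."""
    return math.sqrt(cdot(u, u).real)


u = [2 + 3j, 4 - 1j, 3 + 5j]
v = [3 - 4j, 5j, 4 - 2j]

print(cdot(u, v))  # (-9+23j)
print(cdot(u, u))  # (64+0j)
print(cnorm(u))    # 8.0
print(cdot(u, v) == cdot(v, u).conjugate())  # True
```

The last line illustrates the conjugate symmetry that replaces $u \cdot v = v \cdot u$ in the complex case.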


SOLVED PROBLEMS 


Vectors in R^n

1.1. Determine which of the following vectors are equal:

$u_1 = (1, 2, 3)$, $u_2 = (2, 3, 1)$, $u_3 = (1, 3, 2)$, $u_4 = (2, 3, 1)$


Vectors are equal only when corresponding entries are equal; hence, only $u_2 = u_4$.
1.2. Let u = (2, -7, 1), v = (-3, 0, 4), w = (0, 5, -8). Find:

(a) 3u - 4v,

(b) 2u + 3v - 5w.

First perform the scalar multiplication and then the vector addition.

(a) 3u - 4v = 3(2, -7, 1) - 4(-3, 0, 4) = (6, -21, 3) + (12, 0, -16) = (18, -21, -13)

(b) 2u + 3v - 5w = (4, -14, 2) + (-9, 0, 12) + (0, -25, 40) = (-5, -39, 54)


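1.3. Let $u = \begin{bmatrix} 5 \\ 3 \\ -4 \end{bmatrix}$, $v = \begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix}$, $w = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Find:

(a) 5u - 2v,

(b) -2u + 4v - 3w.

First perform the scalar multiplication and then the vector addition:

(a) $5u - 2v = 5\begin{bmatrix} 5 \\ 3 \\ -4 \end{bmatrix} - 2\begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix} = \begin{bmatrix} 25 \\ 15 \\ -20 \end{bmatrix} + \begin{bmatrix} 2 \\ -10 \\ -4 \end{bmatrix} = \begin{bmatrix} 27 \\ 5 \\ -24 \end{bmatrix}$

(b) $-2u + 4v - 3w = \begin{bmatrix} -10 \\ -6 \\ 8 \end{bmatrix} + \begin{bmatrix} -4 \\ 20 \\ 8 \end{bmatrix} + \begin{bmatrix} -9 \\ 3 \\ 6 \end{bmatrix} = \begin{bmatrix} -23 \\ 17 \\ 22 \end{bmatrix}$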

1.4. Find x and y, where: (a) (x, 3) = (2, x + y), (b) (4, y) = x(2, 3).

(a) Because the vectors are equal, set the corresponding entries equal to each other, yielding

x = 2, 3 = x + y

Solve the linear equations, obtaining x = 2, y = 1.

(b) First multiply by the scalar x to obtain (4, y) = (2x, 3x). Then set corresponding entries equal to each
other to obtain

4 = 2x, y = 3x

Solve the equations to yield x = 2, y = 6.

1.5. Write the vector v = (1, -2, 5) as a linear combination of the vectors $u_1 = (1, 1, 1)$, $u_2 = (1, 2, 3)$,
$u_3 = (2, -1, 1)$.

We want to express v in the form $v = xu_1 + yu_2 + zu_3$ with x, y, z as yet unknown. First we have

$\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + z\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ x + 2y - z \\ x + 3y + z \end{bmatrix}$

(It is more convenient to write vectors as columns than as rows when forming linear combinations.) Set
corresponding entries equal to each other to obtain

x + y + 2z = 1          x + y + 2z = 1          x + y + 2z = 1
x + 2y - z = -2    or       y - 3z = -3    or       y - 3z = -3
x + 3y + z = 5             2y - z = 4                  5z = 10

The unique solution of the triangular system is x = -6, y = 3, z = 2. Thus, $v = -6u_1 + 3u_2 + 2u_3$.
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1.6. Write v = (2, —5,3) as a linear combination of 

Ml = (1,—3,2 ),« 2 = (2, —4, —1), m 3 = (1, —5,7). 
Find the equivalent system of linear equations and then solve. First, 


2' 


r 


2' 


r 


x + 2y + z 

-5 

= X 

-3 

+ T 

-4 

+ z 

-5 

= 

—3x — 4y — 5z 

3 


2 


-1 


7 


2x - y + lz 


Set the corresponding entries equal to each other to obtain 

x+2y+ z= 2 x+ 2y+ z= 2 x+2y+ z = 2 

—3x — Ay — 5z = — 5 or 2y — 2 z = 1 or 2y — 2z = 1 

2x — y + lz = 3 - 5v + 5z = -1 0 = 3 

The third equation, O.r + Oy + Oz = 3, indicates that the system has no solution. Thus, v cannot be written as a 
linear combination of the vectors u l , u 2 , m 3 . 
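
Problems 1.5 and 1.6 both reduce to solving a small linear system, so the same elimination can be sketched in Python. This is a minimal illustration, not the book's algorithm: the function name `combination_coefficients` is hypothetical, exact `Fraction` arithmetic is an assumption made to avoid rounding, and the sketch only handles the two cases met above (a unique solution, or no solution).

```python
from fractions import Fraction


def combination_coefficients(target, vectors):
    """Return c with c[0]*vectors[0] + ... == target, or None if impossible.

    Assumes the solution is unique whenever the system is consistent.
    """
    n, m = len(target), len(vectors)
    # Augmented matrix: column j holds vectors[j]; the last column holds target.
    rows = [[Fraction(vectors[j][i]) for j in range(m)] + [Fraction(target[i])]
            for i in range(n)]
    pivot_row = 0
    for col in range(m):
        pivot = next((r for r in range(pivot_row, n) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot available in this column
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        lead = rows[pivot_row][col]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        for r in range(n):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # A row reading 0 = (nonzero) means the system has no solution.
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in rows):
        return None
    if pivot_row < m:
        raise ValueError("solution is not unique; outside the scope of this sketch")
    return [rows[j][-1] for j in range(m)]


# Problem 1.5: v = -6*u1 + 3*u2 + 2*u3
coeffs_15 = combination_coefficients((1, -2, 5), [(1, 1, 1), (1, 2, 3), (2, -1, 1)])
# Problem 1.6: no solution
coeffs_16 = combination_coefficients((2, -5, 3), [(1, -3, 2), (2, -4, -1), (1, -5, 7)])
print(coeffs_15, coeffs_16)
```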

Dot (Inner) Product, Orthogonality, Norm in $\mathbf{R}^n$

1.7. Find $u \cdot v$ where:

(a) $u = (2,-5,6)$ and $v = (8,2,-3)$,

(b) $u = (4,2,-3,5,-1)$ and $v = (2,6,-1,-4,8)$.

Multiply the corresponding components and add:

(a) $u \cdot v = 2(8) - 5(2) + 6(-3) = 16 - 10 - 18 = -12$

(b) $u \cdot v = 8 + 12 + 3 - 20 - 8 = -5$

1.8. Let $u = (5,4,1)$, $v = (3,-4,1)$, $w = (1,-2,3)$. Which pair of vectors, if any, are perpendicular
(orthogonal)?

Find the dot product of each pair of vectors:

$u \cdot v = 15 - 16 + 1 = 0, \quad v \cdot w = 3 + 8 + 3 = 14, \quad u \cdot w = 5 - 8 + 3 = 0$

Thus, $u$ and $v$ are orthogonal, $u$ and $w$ are orthogonal, but $v$ and $w$ are not.
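
The pairwise test in Problem 1.8 is easy to mechanize. The following sketch (the dictionary of named vectors is only a convenience for printing) computes each dot product and keeps the pairs whose product is zero.

```python
from itertools import combinations


def dot(u, v):
    """Dot product: multiply corresponding components and add."""
    return sum(a * b for a, b in zip(u, v))


vectors = {"u": (5, 4, 1), "v": (3, -4, 1), "w": (1, -2, 3)}

# Two vectors are orthogonal exactly when their dot product is 0.
orthogonal_pairs = [(a, b) for a, b in combinations(vectors, 2)
                    if dot(vectors[a], vectors[b]) == 0]
print(orthogonal_pairs)  # [('u', 'v'), ('u', 'w')]
```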

1.9. Find $k$ so that $u$ and $v$ are orthogonal, where:

(a) $u = (1, k, -3)$ and $v = (2, -5, 4)$,

(b) $u = (2, 3k, -4, 1, 5)$ and $v = (6, -1, 3, 7, 2k)$.

Compute $u \cdot v$, set $u \cdot v$ equal to 0, and then solve for $k$:

(a) $u \cdot v = 1(2) + k(-5) - 3(4) = -5k - 10$. Then $-5k - 10 = 0$, or $k = -2$.

(b) $u \cdot v = 12 - 3k - 12 + 7 + 10k = 7k + 7$. Then $7k + 7 = 0$, or $k = -1$.

1.10. Find $\|u\|$, where: (a) $u = (3,-12,-4)$, (b) $u = (2,-3,8,-7)$.

First find $\|u\|^2 = u \cdot u$ by squaring the entries and adding. Then $\|u\| = \sqrt{\|u\|^2}$.

(a) $\|u\|^2 = (3)^2 + (-12)^2 + (-4)^2 = 9 + 144 + 16 = 169$. Then $\|u\| = \sqrt{169} = 13$.

(b) $\|u\|^2 = 4 + 9 + 64 + 49 = 126$. Then $\|u\| = \sqrt{126}$.

1.11. Recall that normalizing a nonzero vector $v$ means finding the unique unit vector $\hat{v}$ in the same
direction as $v$, where

$\hat{v} = \dfrac{1}{\|v\|}\, v$

Normalize: (a) $u = (3,-4)$, (b) $v = (4,-2,-3,8)$, (c) $w = (\tfrac{1}{2}, \tfrac{2}{3}, -\tfrac{1}{4})$.

(a) First find $\|u\| = \sqrt{9 + 16} = \sqrt{25} = 5$. Then divide each entry of $u$ by 5, obtaining $\hat{u} = (\tfrac{3}{5}, -\tfrac{4}{5})$.

(b) Here $\|v\| = \sqrt{16 + 4 + 9 + 64} = \sqrt{93}$. Then

$\hat{v} = \left( \dfrac{4}{\sqrt{93}}, \dfrac{-2}{\sqrt{93}}, \dfrac{-3}{\sqrt{93}}, \dfrac{8}{\sqrt{93}} \right)$

(c) Note that $w$ and any positive multiple of $w$ will have the same normalized form. Hence, first multiply $w$ by
12 to "clear fractions", that is, first find $w' = 12w = (6, 8, -3)$. Then

$\|w'\| = \sqrt{36 + 64 + 9} = \sqrt{109}$ and $\hat{w} = \hat{w'} = \left( \dfrac{6}{\sqrt{109}}, \dfrac{8}{\sqrt{109}}, \dfrac{-3}{\sqrt{109}} \right)$
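
As a rough numerical companion to Problems 1.10 and 1.11, the sketch below computes norms and unit vectors in floating point. The function names are illustrative. It also checks the remark in part (c) that a positive multiple of a vector normalizes to the same unit vector.

```python
import math


def norm(v):
    """Euclidean norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Return the unit vector in the direction of the nonzero vector v."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(x / length for x in v)


print(norm((3, -12, -4)))  # 13.0
print(normalize((3, -4)))  # (0.6, -0.8)

# Problem 1.11(c): w and w' = 12w share the same normalized form.
w = (1 / 2, 2 / 3, -1 / 4)
w_prime = (6, 8, -3)
same_direction = all(math.isclose(a, b)
                     for a, b in zip(normalize(w), normalize(w_prime)))
print(same_direction)  # True
```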


1.12. Let $u = (1,-3,4)$ and $v = (3,4,7)$. Find:

(a) $\cos\theta$, where $\theta$ is the angle between $u$ and $v$;

(b) $\operatorname{proj}(u, v)$, the projection of $u$ onto $v$;

(c) $d(u, v)$, the distance between $u$ and $v$.

First find $u \cdot v = 3 - 12 + 28 = 19$, $\|u\|^2 = 1 + 9 + 16 = 26$, $\|v\|^2 = 9 + 16 + 49 = 74$. Then

(a) $\cos\theta = \dfrac{u \cdot v}{\|u\|\|v\|} = \dfrac{19}{\sqrt{26}\sqrt{74}}$

(b) $\operatorname{proj}(u, v) = \dfrac{u \cdot v}{\|v\|^2}\, v = \dfrac{19}{74}(3,4,7) = \left( \dfrac{57}{74}, \dfrac{76}{74}, \dfrac{133}{74} \right) = \left( \dfrac{57}{74}, \dfrac{38}{37}, \dfrac{133}{74} \right)$

(c) $d(u, v) = \|u - v\| = \|(-2,-7,-3)\| = \sqrt{4 + 49 + 9} = \sqrt{62}$
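
The three quantities of Problem 1.12 follow directly from the dot product, as this sketch illustrates. Exact `Fraction` arithmetic is used for the projection so the result can be compared with the fractions above; that choice, and the function names, are assumptions of the sketch.

```python
import math
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cos_angle(u, v):
    """cos(theta) = (u . v) / (||u|| ||v||)."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def projection(u, v):
    """proj(u, v) = ((u . v) / ||v||^2) v, for integer vectors, kept exact."""
    c = Fraction(dot(u, v), dot(v, v))
    return tuple(c * x for x in v)


def distance(u, v):
    """d(u, v) = ||u - v||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


u, v = (1, -3, 4), (3, 4, 7)
proj_uv = projection(u, v)
print(proj_uv)  # (Fraction(57, 74), Fraction(38, 37), Fraction(133, 74))
print(cos_angle(u, v), distance(u, v))
```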


1.13. Prove Theorem 1.2: For any $u, v, w$ in $\mathbf{R}^n$ and $k$ in $\mathbf{R}$:

(i) $(u + v) \cdot w = u \cdot w + v \cdot w$, (ii) $(ku) \cdot v = k(u \cdot v)$, (iii) $u \cdot v = v \cdot u$,

(iv) $u \cdot u \ge 0$, and $u \cdot u = 0$ iff $u = 0$.

Let $u = (u_1, u_2, \ldots, u_n)$, $v = (v_1, v_2, \ldots, v_n)$, $w = (w_1, w_2, \ldots, w_n)$.

(i) Because $u + v = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n)$,

$\begin{aligned} (u + v) \cdot w &= (u_1 + v_1)w_1 + (u_2 + v_2)w_2 + \cdots + (u_n + v_n)w_n \\ &= u_1w_1 + v_1w_1 + u_2w_2 + v_2w_2 + \cdots + u_nw_n + v_nw_n \\ &= (u_1w_1 + u_2w_2 + \cdots + u_nw_n) + (v_1w_1 + v_2w_2 + \cdots + v_nw_n) \\ &= u \cdot w + v \cdot w \end{aligned}$

(ii) Because $ku = (ku_1, ku_2, \ldots, ku_n)$,

$(ku) \cdot v = ku_1v_1 + ku_2v_2 + \cdots + ku_nv_n = k(u_1v_1 + u_2v_2 + \cdots + u_nv_n) = k(u \cdot v)$

(iii) $u \cdot v = u_1v_1 + u_2v_2 + \cdots + u_nv_n = v_1u_1 + v_2u_2 + \cdots + v_nu_n = v \cdot u$

(iv) Because $u_i^2$ is nonnegative for each $i$, and because the sum of nonnegative real numbers is nonnegative,

$u \cdot u = u_1^2 + u_2^2 + \cdots + u_n^2 \ge 0$

Furthermore, $u \cdot u = 0$ iff $u_i = 0$ for each $i$, that is, iff $u = 0$.

1.14. Prove Theorem 1.3 (Schwarz): $|u \cdot v| \le \|u\|\|v\|$.

For any real number $t$, and using Theorem 1.2, we have

$0 \le (tu + v) \cdot (tu + v) = t^2(u \cdot u) + 2t(u \cdot v) + (v \cdot v) = \|u\|^2 t^2 + 2(u \cdot v)t + \|v\|^2$

Let $a = \|u\|^2$, $b = 2(u \cdot v)$, $c = \|v\|^2$. Then, for every value of $t$, $at^2 + bt + c \ge 0$. This means that the
quadratic polynomial cannot have two distinct real roots. This implies that the discriminant $D = b^2 - 4ac \le 0$ or,
equivalently, $b^2 \le 4ac$. Thus,

$4(u \cdot v)^2 \le 4\|u\|^2\|v\|^2$

Dividing by 4 and taking the square root of both sides gives us our result.

1.15. Prove Theorem 1.4 (Minkowski): $\|u + v\| \le \|u\| + \|v\|$.

By the Schwarz inequality and other properties of the dot product,

$\|u + v\|^2 = (u + v) \cdot (u + v) = (u \cdot u) + 2(u \cdot v) + (v \cdot v) \le \|u\|^2 + 2\|u\|\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2$

Taking the square root of both sides yields the desired inequality.
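
A numerical spot check is no substitute for the proofs in Problems 1.14 and 1.15, but it can make the two inequalities concrete. The sketch below tests them on random integer vectors. The seed, the vector length, and the small tolerance added to the Minkowski comparison (to absorb floating-point rounding) are all assumptions of this illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def inequalities_hold(u, v):
    """Check Schwarz (exactly, in squared form) and Minkowski (in floating point)."""
    schwarz = dot(u, v) ** 2 <= dot(u, u) * dot(v, v)  # integer arithmetic, exact
    total = tuple(a + b for a, b in zip(u, v))
    minkowski = norm(total) <= norm(u) + norm(v) + 1e-12
    return schwarz and minkowski


random.seed(0)
samples = [(tuple(random.randint(-9, 9) for _ in range(4)),
            tuple(random.randint(-9, 9) for _ in range(4)))
           for _ in range(1000)]
all_hold = all(inequalities_hold(u, v) for u, v in samples)
print(all_hold)  # True

# Equality occurs for parallel vectors, for example u = (1, 2) and v = 2u.
print(dot((1, 2), (2, 4)) ** 2 == dot((1, 2), (1, 2)) * dot((2, 4), (2, 4)))  # True
```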

Points, Lines, Hyperplanes in $\mathbf{R}^n$

Here we distinguish between an $n$-tuple $P(a_1, a_2, \ldots, a_n)$ viewed as a point in $\mathbf{R}^n$ and an $n$-tuple
$u = [c_1, c_2, \ldots, c_n]$ viewed as a vector (arrow) from the origin $O$ to the point $C(c_1, c_2, \ldots, c_n)$.

1.16. Find the vector $u$ identified with the directed line segment $\overrightarrow{PQ}$ for the points:

(a) $P(1,-2,4)$ and $Q(6,1,-5)$ in $\mathbf{R}^3$, (b) $P(2,3,-6,5)$ and $Q(7,1,4,-8)$ in $\mathbf{R}^4$.

(a) $u = \overrightarrow{PQ} = Q - P = [6 - 1,\ 1 - (-2),\ -5 - 4] = [5, 3, -9]$

(b) $u = \overrightarrow{PQ} = Q - P = [7 - 2,\ 1 - 3,\ 4 + 6,\ -8 - 5] = [5, -2, 10, -13]$

1.17. Find an equation of the hyperplane $H$ in $\mathbf{R}^4$ that passes through $P(3,-4,1,-2)$ and is normal to
$u = [2, 5, -6, -3]$.

The coefficients of the unknowns of an equation of $H$ are the components of the normal vector $u$. Thus, an
equation of $H$ is of the form $2x_1 + 5x_2 - 6x_3 - 3x_4 = k$. Substitute $P$ into this equation to obtain
$k = 2(3) + 5(-4) - 6(1) - 3(-2) = -14$. Thus, an equation of $H$ is $2x_1 + 5x_2 - 6x_3 - 3x_4 = -14$.

1.18. Find an equation of the plane $H$ in $\mathbf{R}^3$ that contains $P(1,-3,-4)$ and is parallel to the plane $H'$
determined by the equation $3x - 6y + 5z = 2$.

The planes $H$ and $H'$ are parallel if and only if their normal directions are parallel or antiparallel (opposite
direction). Hence, an equation of $H$ is of the form $3x - 6y + 5z = k$. Substitute $P$ into this equation to obtain
$k = 1$. Then an equation of $H$ is $3x - 6y + 5z = 1$.
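
In both hyperplane problems the constant $k$ is simply the dot product of the normal vector with the given point. A minimal sketch (the function name `hyperplane_constant` is hypothetical), using the data of Problem 1.18:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hyperplane_constant(normal, point):
    """Return k such that the hyperplane normal . x = k passes through point."""
    return dot(normal, point)


# Problem 1.18: normal direction [3, -6, 5] taken from H', point P(1, -3, -4).
k = hyperplane_constant((3, -6, 5), (1, -3, -4))
print(k)  # 1
```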

1.19. Find a parametric representation of the line $L$ in $\mathbf{R}^4$ passing through $P(4,-2,3,1)$ in the direction
of $u = [2, 5, -7, 8]$.

Here $L$ consists of the points $X(x_i)$ that satisfy

$X = P + tu$ or $x_i = a_it + b_i$ or $L(t) = (a_it + b_i)$

where the parameter $t$ takes on all real values. Thus we obtain

$x_1 = 4 + 2t,\ x_2 = -2 + 5t,\ x_3 = 3 - 7t,\ x_4 = 1 + 8t$ or $L(t) = (4 + 2t,\ -2 + 5t,\ 3 - 7t,\ 1 + 8t)$

1.20. Let $C$ be the curve $F(t) = (t^2,\ 3t - 2,\ t^3,\ t^2 + 5)$ in $\mathbf{R}^4$, where $0 \le t \le 4$.

(a) Find the point $P$ on $C$ corresponding to $t = 2$.

(b) Find the initial point $Q$ and terminal point $Q'$ of $C$.

(c) Find the unit tangent vector $T$ to the curve $C$ when $t = 2$.

(a) Substitute $t = 2$ into $F(t)$ to get $P = F(2) = (4, 4, 8, 9)$.

(b) The parameter $t$ ranges from $t = 0$ to $t = 4$. Hence, $Q = F(0) = (0, -2, 0, 5)$ and
$Q' = F(4) = (16, 10, 64, 21)$.

(c) Take the derivative of $F(t)$, that is, of each component of $F(t)$, to obtain a vector $V$ that is tangent to the
curve:

$V(t) = \dfrac{dF(t)}{dt} = [2t,\ 3,\ 3t^2,\ 2t]$

Now find $V$ when $t = 2$; that is, substitute $t = 2$ in the equation for $V(t)$ to obtain
$V = V(2) = [4, 3, 12, 4]$. Then normalize $V$ to obtain the desired unit tangent vector $T$. We have

$\|V\| = \sqrt{16 + 9 + 144 + 16} = \sqrt{185}$ and $T = \left[ \dfrac{4}{\sqrt{185}}, \dfrac{3}{\sqrt{185}}, \dfrac{12}{\sqrt{185}}, \dfrac{4}{\sqrt{185}} \right]$
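
The tangent-vector computation of Problem 1.20 can be sketched as follows. The derivative `V` is entered by hand, exactly as in the solution. The central-difference check (with an arbitrarily chosen step `h`) is an added sanity test, not part of the original solution.

```python
import math


def F(t):
    """The curve of Problem 1.20."""
    return (t**2, 3 * t - 2, t**3, t**2 + 5)


def V(t):
    """Componentwise derivative of F, worked out by hand."""
    return (2 * t, 3, 3 * t**2, 2 * t)


def unit_tangent(t):
    """Normalize V(t) to obtain the unit tangent vector T."""
    v = V(t)
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def numeric_derivative(t, h=1e-6):
    """Central-difference approximation to dF/dt, used only as a check."""
    return tuple((a - b) / (2 * h) for a, b in zip(F(t + h), F(t - h)))


print(F(2), F(0), F(4))  # (4, 4, 8, 9) (0, -2, 0, 5) (16, 10, 64, 21)
print(V(2))              # (4, 3, 12, 4)
T = unit_tangent(2)
```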


Spatial Vectors (Vectors in $\mathbf{R}^3$), ijk Notation, Cross Product

1.21. Let $u = 2\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}$, $v = 3\mathbf{i} + \mathbf{j} - 2\mathbf{k}$, $w = \mathbf{i} + 5\mathbf{j} + 3\mathbf{k}$. Find:

(a) $u + v$, (b) $2u - 3v + 4w$, (c) $u \cdot v$ and $u \cdot w$, (d) $\|u\|$ and $\|v\|$.

Treat the coefficients of $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ just like the components of a vector in $\mathbf{R}^3$.

(a) Add corresponding coefficients to get $u + v = 5\mathbf{i} - 2\mathbf{j} + 2\mathbf{k}$.

(b) First perform the scalar multiplication and then the vector addition:

$2u - 3v + 4w = (4\mathbf{i} - 6\mathbf{j} + 8\mathbf{k}) + (-9\mathbf{i} - 3\mathbf{j} + 6\mathbf{k}) + (4\mathbf{i} + 20\mathbf{j} + 12\mathbf{k}) = -\mathbf{i} + 11\mathbf{j} + 26\mathbf{k}$

(c) Multiply corresponding coefficients and then add:

$u \cdot v = 6 - 3 - 8 = -5$ and $u \cdot w = 2 - 15 + 12 = -1$

(d) The norm is the square root of the sum of the squares of the coefficients:

$\|u\| = \sqrt{4 + 9 + 16} = \sqrt{29}$ and $\|v\| = \sqrt{9 + 1 + 4} = \sqrt{14}$


1.22. Find the (parametric) equation of the line $L$:

(a) through the points $P(1,3,2)$ and $Q(2,5,-6)$;

(b) containing the point $P(1,-2,4)$ and perpendicular to the plane $H$ given by the equation
$3x + 5y + 7z = 15$.

(a) First find $v = \overrightarrow{PQ} = Q - P = [1, 2, -8] = \mathbf{i} + 2\mathbf{j} - 8\mathbf{k}$. Then

$L(t) = (t + 1,\ 2t + 3,\ -8t + 2) = (t + 1)\mathbf{i} + (2t + 3)\mathbf{j} + (-8t + 2)\mathbf{k}$

(b) Because $L$ is perpendicular to $H$, the line $L$ is in the same direction as the normal vector $N = 3\mathbf{i} + 5\mathbf{j} + 7\mathbf{k}$
to $H$. Thus,

$L(t) = (3t + 1,\ 5t - 2,\ 7t + 4) = (3t + 1)\mathbf{i} + (5t - 2)\mathbf{j} + (7t + 4)\mathbf{k}$
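
A parametric line $L(t) = P + tv$ translates naturally into a function of $t$. The sketch below builds both lines of Problem 1.22. The helper names are illustrative, and the same code works unchanged in $\mathbf{R}^4$.

```python
def line_through(point, direction):
    """Return the function L(t) = point + t * direction."""
    def L(t):
        return tuple(p + t * d for p, d in zip(point, direction))
    return L


def line_through_points(P, Q):
    """Line through P and Q, using the direction vector Q - P."""
    return line_through(P, tuple(q - p for p, q in zip(P, Q)))


# (a) through P(1, 3, 2) and Q(2, 5, -6)
La = line_through_points((1, 3, 2), (2, 5, -6))
# (b) through P(1, -2, 4) in the direction of the normal N = [3, 5, 7]
Lb = line_through((1, -2, 4), (3, 5, 7))

print(La(0), La(1))  # (1, 3, 2) (2, 5, -6)
print(Lb(2))         # (7, 8, 18)
```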

1.23. Let $S$ be the surface $xy^2 + 2yz = 16$ in $\mathbf{R}^3$.

(a) Find the normal vector $N(x, y, z)$ to the surface $S$.

(b) Find the tangent plane $H$ to $S$ at the point $P(1,2,3)$.

(a) The formula for the normal vector to a surface $F(x, y, z) = 0$ is

$N(x, y, z) = F_x\mathbf{i} + F_y\mathbf{j} + F_z\mathbf{k}$

where $F_x$, $F_y$, $F_z$ are the partial derivatives. Using $F(x, y, z) = xy^2 + 2yz - 16$, we obtain

$F_x = y^2, \quad F_y = 2xy + 2z, \quad F_z = 2y$

Thus, $N(x, y, z) = y^2\mathbf{i} + (2xy + 2z)\mathbf{j} + 2y\mathbf{k}$.

(b) The normal to the surface $S$ at the point $P$ is

$N(P) = N(1, 2, 3) = 4\mathbf{i} + 10\mathbf{j} + 4\mathbf{k}$

Hence, $N = 2\mathbf{i} + 5\mathbf{j} + 2\mathbf{k}$ is also normal to $S$ at $P$. Thus an equation of $H$ has the form $2x + 5y + 2z = c$.
Substitute $P$ in this equation to obtain $c = 18$. Thus the tangent plane $H$ to $S$ at $P$ is $2x + 5y + 2z = 18$.


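1.24. Evaluate the following determinants and negative of determinants of order two:

(a) (i) $\begin{vmatrix} 3 & 4 \\ 5 & 9 \end{vmatrix}$, (ii) $\begin{vmatrix} 2 & -1 \\ 4 & 3 \end{vmatrix}$, (iii) $\begin{vmatrix} 4 & -5 \\ 3 & -2 \end{vmatrix}$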

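(b) (i) $-\begin{vmatrix} 3 & 6 \\ 4 & 2 \end{vmatrix}$, (ii) $-\begin{vmatrix} 7 & -5 \\ 3 & 2 \end{vmatrix}$, (iii) $-\begin{vmatrix} 4 & -1 \\ 8 & -3 \end{vmatrix}$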


Use $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$ and $-\begin{vmatrix} a & b \\ c & d \end{vmatrix} = bc - ad$. Thus,

(a) (i) $27 - 20 = 7$, (ii) $6 + 4 = 10$, (iii) $-8 + 15 = 7$.

(b) (i) $24 - 6 = 18$, (ii) $-15 - 14 = -29$, (iii) $-8 + 12 = 4$.


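1.25. Let $u = 2\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}$, $v = 3\mathbf{i} + \mathbf{j} - 2\mathbf{k}$, $w = \mathbf{i} + 5\mathbf{j} + 3\mathbf{k}$.

Find: (a) $u \times v$, (b) $u \times w$

(a) Use $\begin{bmatrix} 2 & -3 & 4 \\ 3 & 1 & -2 \end{bmatrix}$ to get $u \times v = (6 - 4)\mathbf{i} + (12 + 4)\mathbf{j} + (2 + 9)\mathbf{k} = 2\mathbf{i} + 16\mathbf{j} + 11\mathbf{k}$.

(b) Use $\begin{bmatrix} 2 & -3 & 4 \\ 1 & 5 & 3 \end{bmatrix}$ to get $u \times w = (-9 - 20)\mathbf{i} + (4 - 6)\mathbf{j} + (10 + 3)\mathbf{k} = -29\mathbf{i} - 2\mathbf{j} + 13\mathbf{k}$.

1.26. Find $u \times v$, where: (a) $u = (1,2,3)$, $v = (4,5,6)$; (b) $u = (-4,7,3)$, $v = (6,-5,2)$.

(a) Use $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ to get $u \times v = [12 - 15,\ 12 - 6,\ 5 - 8] = [-3, 6, -3]$.

(b) Use $\begin{bmatrix} -4 & 7 & 3 \\ 6 & -5 & 2 \end{bmatrix}$ to get $u \times v = [14 + 15,\ 18 + 8,\ 20 - 42] = [29, 26, -22]$.

1.27. Find a unit vector $u$ orthogonal to $v = [1, 3, 4]$ and $w = [2, -6, -5]$.

First find $v \times w$, which is orthogonal to $v$ and $w$.

The array $\begin{bmatrix} 1 & 3 & 4 \\ 2 & -6 & -5 \end{bmatrix}$ gives $v \times w = [-15 + 24,\ 8 + 5,\ -6 - 6] = [9, 13, -12]$.

Normalize $v \times w$ to get $u = [9/\sqrt{394},\ 13/\sqrt{394},\ -12/\sqrt{394}]$.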
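
The component formula for the cross product, used in Problems 1.25 through 1.27, is short enough to write out directly. This sketch (function names are illustrative) recomputes Problem 1.27 and confirms that the result is orthogonal to both factors.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two vectors in R^3, component by component."""
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


v, w = (1, 3, 4), (2, -6, -5)
n = cross(v, w)
print(n)                     # (9, 13, -12)
print(dot(n, v), dot(n, w))  # 0 0

# Normalize n to obtain a unit vector orthogonal to both v and w.
length = math.sqrt(dot(n, n))  # sqrt(394)
unit = tuple(x / length for x in n)
```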

1.28. Let $u = (a_1, a_2, a_3)$ and $v = (b_1, b_2, b_3)$, so $u \times v = (a_2b_3 - a_3b_2,\ a_3b_1 - a_1b_3,\ a_1b_2 - a_2b_1)$.
Prove:

(a) $u \times v$ is orthogonal to $u$ and $v$ [Theorem 1.5(a)],

(b) $\|u \times v\|^2 = (u \cdot u)(v \cdot v) - (u \cdot v)^2$ (Lagrange's identity).

(a) We have

$\begin{aligned} u \cdot (u \times v) &= a_1(a_2b_3 - a_3b_2) + a_2(a_3b_1 - a_1b_3) + a_3(a_1b_2 - a_2b_1) \\ &= a_1a_2b_3 - a_1a_3b_2 + a_2a_3b_1 - a_1a_2b_3 + a_1a_3b_2 - a_2a_3b_1 = 0 \end{aligned}$

Thus, $u \times v$ is orthogonal to $u$. Similarly, $u \times v$ is orthogonal to $v$.

(b) We have

$\|u \times v\|^2 = (a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2 \qquad (1)$

$(u \cdot u)(v \cdot v) - (u \cdot v)^2 = (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 \qquad (2)$

Expansion of the right-hand sides of (1) and (2) establishes the identity.

Complex Numbers, Vectors in $\mathbf{C}^n$

1.29. Suppose $z = 5 + 3i$ and $w = 2 - 4i$. Find: (a) $z + w$, (b) $z - w$, (c) $zw$.

Use the ordinary rules of algebra together with $i^2 = -1$ to obtain a result in the standard form $a + bi$.

(a) $z + w = (5 + 3i) + (2 - 4i) = 7 - i$

(b) $z - w = (5 + 3i) - (2 - 4i) = 5 + 3i - 2 + 4i = 3 + 7i$

(c) $zw = (5 + 3i)(2 - 4i) = 10 - 14i - 12i^2 = 10 - 14i + 12 = 22 - 14i$

1.30. Simplify: (a) $(5 + 3i)(2 - 7i)$, (b) $(4 - 3i)^2$, (c) $(1 + 2i)^3$.

(a) $(5 + 3i)(2 - 7i) = 10 + 6i - 35i - 21i^2 = 31 - 29i$

(b) $(4 - 3i)^2 = 16 - 24i + 9i^2 = 7 - 24i$

(c) $(1 + 2i)^3 = 1 + 6i + 12i^2 + 8i^3 = 1 + 6i - 12 - 8i = -11 - 2i$

1.31. Simplify: (a) $i^0, i^3, i^4$, (b) $i^5, i^6, i^7, i^8$, (c) $i^{39}, i^{174}, i^{252}, i^{317}$.

(a) $i^0 = 1$, $i^3 = i^2(i) = (-1)(i) = -i$, $i^4 = (i^2)(i^2) = (-1)(-1) = 1$

(b) $i^5 = (i^4)(i) = (1)(i) = i$, $i^6 = (i^4)(i^2) = (1)(i^2) = i^2 = -1$, $i^7 = i^3 = -i$, $i^8 = i^4 = 1$

(c) Using $i^4 = 1$ and $i^n = i^{4q+r} = (i^4)^q i^r = 1^q i^r = i^r$, divide the exponent $n$ by 4 to obtain the remainder $r$:

$i^{39} = i^{4(9)+3} = (i^4)^9 i^3 = 1^9 i^3 = i^3 = -i, \quad i^{174} = i^2 = -1, \quad i^{252} = i^0 = 1, \quad i^{317} = i^1 = i$
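
The remainder rule of Problem 1.31(c) amounts to a table lookup on $n \bmod 4$. A small sketch using Python's built-in complex type, where `1j` denotes $i$ (the function name `power_of_i` is illustrative):

```python
# i^0, i^1, i^2, i^3; every higher power repeats this cycle of length 4.
POWERS_OF_I = (1, 1j, -1, -1j)


def power_of_i(n):
    """Return i**n for a nonnegative integer n, using the remainder of n divided by 4."""
    return POWERS_OF_I[n % 4]


for n in (39, 174, 252, 317):
    print(n, n % 4, power_of_i(n))
```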

1.32. Find the complex conjugate of each of the following:

(a) $6 + 4i$, $7 - 5i$, $4 + i$, $-3 - i$, (b) $6$, $-3$, $4i$, $-9i$.

(a) $\overline{6 + 4i} = 6 - 4i$, $\overline{7 - 5i} = 7 + 5i$, $\overline{4 + i} = 4 - i$, $\overline{-3 - i} = -3 + i$

(b) $\overline{6} = 6$, $\overline{-3} = -3$, $\overline{4i} = -4i$, $\overline{-9i} = 9i$

(Note that the conjugate of a real number is the original number, but the conjugate of a pure imaginary
number is the negative of the original number.)

1.33. Find $z\bar{z}$ and $|z|$ when $z = 3 + 4i$.

For $z = a + bi$, use $z\bar{z} = a^2 + b^2$ and $|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$.

$z\bar{z} = 9 + 16 = 25, \quad |z| = \sqrt{25} = 5$

1.34. Simplify $\dfrac{2 - 7i}{5 + 3i}$.

To simplify a fraction $z/w$ of complex numbers, multiply both numerator and denominator by $\bar{w}$, the
conjugate of the denominator:

$\dfrac{2 - 7i}{5 + 3i} = \dfrac{(2 - 7i)(5 - 3i)}{(5 + 3i)(5 - 3i)} = \dfrac{-11 - 41i}{34} = -\dfrac{11}{34} - \dfrac{41}{34}i$

1.35. Prove: For any complex numbers $z, w \in \mathbf{C}$, (i) $\overline{z + w} = \bar{z} + \bar{w}$, (ii) $\overline{zw} = \bar{z}\bar{w}$, (iii) $\bar{\bar{z}} = z$.

Suppose $z = a + bi$ and $w = c + di$ where $a, b, c, d \in \mathbf{R}$.

(i) $\overline{z + w} = \overline{(a + bi) + (c + di)} = \overline{(a + c) + (b + d)i}$

$= (a + c) - (b + d)i = a + c - bi - di$

$= (a - bi) + (c - di) = \bar{z} + \bar{w}$

(ii) $\overline{zw} = \overline{(a + bi)(c + di)} = \overline{(ac - bd) + (ad + bc)i}$

$= (ac - bd) - (ad + bc)i = (a - bi)(c - di) = \bar{z}\bar{w}$

(iii) $\bar{\bar{z}} = \overline{\overline{a + bi}} = \overline{a - bi} = a - (-b)i = a + bi = z$
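
The division rule of Problem 1.34 and the conjugate identities of Problem 1.35 can both be checked mechanically. In the sketch below, a complex number $a + bi$ is represented by the pair `(a, b)` so that the quotient stays an exact fraction. This representation and the name `divide` are assumptions of the sketch. The identity checks at the end use Python's built-in complex type instead.

```python
from fractions import Fraction


def divide(z, w):
    """Divide z = (a, b) by w = (c, d), where each pair stands for a + bi.

    Multiplies numerator and denominator by the conjugate of w.
    """
    a, b = z
    c, d = w
    denom = c * c + d * d  # w times its conjugate
    if denom == 0:
        raise ZeroDivisionError("division by the complex number 0")
    # z * conj(w) = (ac + bd) + (bc - ad)i
    return (Fraction(a * c + b * d, denom), Fraction(b * c - a * d, denom))


quotient = divide((2, -7), (5, 3))
print(quotient)  # (Fraction(-11, 34), Fraction(-41, 34))

# Problem 1.35 on one sample pair, using built-in complex numbers.
z, w = 5 + 3j, 2 - 4j
print((z + w).conjugate() == z.conjugate() + w.conjugate())  # True
print((z * w).conjugate() == z.conjugate() * w.conjugate())  # True
```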

1.36. Prove: For any complex numbers $z, w \in \mathbf{C}$, $|zw| = |z||w|$.

By (ii) of Problem 1.35,

$|zw|^2 = (zw)(\overline{zw}) = (zw)(\bar{z}\bar{w}) = (z\bar{z})(w\bar{w}) = |z|^2|w|^2$

The square root of both sides gives us the desired result.

1.37. Prove: For any complex numbers $z, w \in \mathbf{C}$, $|z + w| \le |z| + |w|$.

Suppose $z = a + bi$ and $w = c + di$ where $a, b, c, d \in \mathbf{R}$. Consider the vectors $u = (a, b)$ and $v = (c, d)$ in
$\mathbf{R}^2$. Note that

$|z| = \sqrt{a^2 + b^2} = \|u\|, \quad |w| = \sqrt{c^2 + d^2} = \|v\|$

and

$|z + w| = |(a + c) + (b + d)i| = \sqrt{(a + c)^2 + (b + d)^2} = \|(a + c, b + d)\| = \|u + v\|$

By Minkowski's inequality (Problem 1.15), $\|u + v\| \le \|u\| + \|v\|$, and so

$|z + w| = \|u + v\| \le \|u\| + \|v\| = |z| + |w|$

1.38. Find the dot products $u \cdot v$ and $v \cdot u$ where: (a) $u = (1 - 2i,\ 3 + i)$, $v = (4 + 2i,\ 5 - 6i)$,
(b) $u = (3 - 2i,\ 4i,\ 1 + 6i)$, $v = (5 + i,\ 2 - 3i,\ 7 + 2i)$.

Recall that conjugates of the second vector appear in the dot product

$(z_1, \ldots, z_n) \cdot (w_1, \ldots, w_n) = z_1\bar{w}_1 + \cdots + z_n\bar{w}_n$

(a) $u \cdot v = (1 - 2i)\overline{(4 + 2i)} + (3 + i)\overline{(5 - 6i)}$

$= (1 - 2i)(4 - 2i) + (3 + i)(5 + 6i) = -10i + 9 + 23i = 9 + 13i$

$v \cdot u = (4 + 2i)\overline{(1 - 2i)} + (5 - 6i)\overline{(3 + i)}$

$= (4 + 2i)(1 + 2i) + (5 - 6i)(3 - i) = 10i + 9 - 23i = 9 - 13i$

(b) $u \cdot v = (3 - 2i)\overline{(5 + i)} + (4i)\overline{(2 - 3i)} + (1 + 6i)\overline{(7 + 2i)}$

$= (3 - 2i)(5 - i) + (4i)(2 + 3i) + (1 + 6i)(7 - 2i) = 20 + 35i$

$v \cdot u = (5 + i)\overline{(3 - 2i)} + (2 - 3i)\overline{(4i)} + (7 + 2i)\overline{(1 + 6i)}$

$= (5 + i)(3 + 2i) + (2 - 3i)(-4i) + (7 + 2i)(1 - 6i) = 20 - 35i$

In both cases, $v \cdot u = \overline{u \cdot v}$. This holds true in general, as seen in Problem 1.40.
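
The complex dot product differs from the real one only in the conjugate applied to the second vector. A short sketch with Python's built-in complex numbers (the name `cdot` is illustrative) reproduces Problem 1.38 and the conjugate symmetry noted above.

```python
def cdot(u, v):
    """Dot product in C^n: conjugate the entries of the second vector."""
    return sum(z * w.conjugate() for z, w in zip(u, v))


# Problem 1.38(a)
u = (1 - 2j, 3 + 1j)
v = (4 + 2j, 5 - 6j)
print(cdot(u, v), cdot(v, u))  # (9+13j) (9-13j)

# Problem 1.38(b)
u2 = (3 - 2j, 4j, 1 + 6j)
v2 = (5 + 1j, 2 - 3j, 7 + 2j)
print(cdot(u2, v2), cdot(v2, u2))  # (20+35j) (20-35j)
```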

1.39. Let $u = (7 - 2i,\ 2 + 5i)$ and $v = (1 + i,\ -3 - 6i)$. Find:

(a) $u + v$, (b) $2iu$, (c) $(3 - i)v$, (d) $u \cdot v$, (e) $\|u\|$ and $\|v\|$.

(a) $u + v = (7 - 2i + 1 + i,\ 2 + 5i - 3 - 6i) = (8 - i,\ -1 - i)$

(b) $2iu = (14i - 4i^2,\ 4i + 10i^2) = (4 + 14i,\ -10 + 4i)$

(c) $(3 - i)v = (3 + 3i - i - i^2,\ -9 - 18i + 3i + 6i^2) = (4 + 2i,\ -15 - 15i)$

(d) $u \cdot v = (7 - 2i)\overline{(1 + i)} + (2 + 5i)\overline{(-3 - 6i)}$

$= (7 - 2i)(1 - i) + (2 + 5i)(-3 + 6i) = 5 - 9i - 36 - 3i = -31 - 12i$

(e) $\|u\| = \sqrt{7^2 + (-2)^2 + 2^2 + 5^2} = \sqrt{82}$ and $\|v\| = \sqrt{1^2 + 1^2 + (-3)^2 + (-6)^2} = \sqrt{47}$

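1.40. Prove: For any vectors $u, v \in \mathbf{C}^n$ and any scalar $z \in \mathbf{C}$, (i) $u \cdot v = \overline{v \cdot u}$, (ii) $(zu) \cdot v = z(u \cdot v)$,
(iii) $u \cdot (zv) = \bar{z}(u \cdot v)$.

Suppose $u = (z_1, z_2, \ldots, z_n)$ and $v = (w_1, w_2, \ldots, w_n)$.

(i) Using the properties of the conjugate,

$\overline{v \cdot u} = \overline{w_1\bar{z}_1 + w_2\bar{z}_2 + \cdots + w_n\bar{z}_n} = \overline{w_1\bar{z}_1} + \overline{w_2\bar{z}_2} + \cdots + \overline{w_n\bar{z}_n}$

$= \bar{w}_1z_1 + \bar{w}_2z_2 + \cdots + \bar{w}_nz_n = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n = u \cdot v$

(ii) Because $zu = (zz_1, zz_2, \ldots, zz_n)$,

$(zu) \cdot v = zz_1\bar{w}_1 + zz_2\bar{w}_2 + \cdots + zz_n\bar{w}_n = z(z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n) = z(u \cdot v)$

(Compare with Theorem 1.2 on vectors in $\mathbf{R}^n$.)

(iii) Using (i) and (ii),

$u \cdot (zv) = \overline{(zv) \cdot u} = \overline{z(v \cdot u)} = \bar{z}\,\overline{(v \cdot u)} = \bar{z}(u \cdot v)$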


SUPPLEMENTARY PROBLEMS 


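Vectors in $\mathbf{R}^n$

1.41. Let $u = (1,-2,4)$, $v = (3,5,1)$, $w = (2,1,-3)$. Find:

(a) $3u - 2v$; (b) $5u + 3v - 4w$; (c) $u \cdot v$, $u \cdot w$, $v \cdot w$; (d) $\|u\|$, $\|v\|$, $\|w\|$;

(e) $\cos\theta$, where $\theta$ is the angle between $u$ and $v$; (f) $d(u, v)$; (g) $\operatorname{proj}(u, v)$.

1.42. Repeat Problem 1.41 for vectors $u = \begin{bmatrix} 1 \\ 3 \\ -4 \end{bmatrix}$, $v = \begin{bmatrix} 2 \\ 1 \\ 5 \end{bmatrix}$, $w = \begin{bmatrix} 3 \\ -2 \\ 6 \end{bmatrix}$.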

1.43. Let $u = (2,-5,4,6,-3)$ and $v = (5,-2,1,-7,-4)$. Find:

(a) $4u - 3v$; (b) $5u + 2v$; (c) $u \cdot v$; (d) $\|u\|$ and $\|v\|$; (e) $\operatorname{proj}(u, v)$; (f) $d(u, v)$.

1.44. Normalize each vector:

(a) $u = (5,-7)$; (b) $v = (1,2,-2,4)$; (c) $w = (\tfrac{1}{2}, -\tfrac{1}{3}, \tfrac{3}{4})$.

1.45. Let $u = (1,2,-2)$, $v = (3,-12,4)$, and $k = -3$.

(a) Find $\|u\|$, $\|v\|$, $\|u + v\|$, $\|ku\|$.

(b) Verify that $\|ku\| = |k|\|u\|$ and $\|u + v\| \le \|u\| + \|v\|$.

1.46. Find $x$ and $y$ where:

(a) $(x,\ y + 1) = (y - 2,\ 6)$; (b) $x(2, y) = y(1, -2)$.

1.47. Find $x, y, z$ where $(x,\ y + 1,\ y + z) = (2x + y,\ 4,\ 3z)$.

1.48. Write $v = (2,5)$ as a linear combination of $u_1$ and $u_2$, where:

(a) $u_1 = (1,2)$ and $u_2 = (3,5)$;

(b) $u_1 = (3,-4)$ and $u_2 = (2,-3)$.

1.49. Write $v = \begin{bmatrix} 9 \\ -3 \\ 16 \end{bmatrix}$ as a linear combination of $u_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$, $u_2 = \begin{bmatrix} 2 \\ 5 \\ -1 \end{bmatrix}$, $u_3 = \begin{bmatrix} 4 \\ -2 \\ 3 \end{bmatrix}$.

1.50. Find $k$ so that $u$ and $v$ are orthogonal, where:

(a) $u = (3, k, -2)$, $v = (6, -4, -3)$;

(b) $u = (5, k, -4, 2)$, $v = (1, -3, 2, 2k)$;

(c) $u = (1, 7, k + 2, -2)$, $v = (3, k, -3, k)$.

Located Vectors, Hyperplanes, Lines in $\mathbf{R}^n$

1.51. Find the vector $v$ identified with the directed line segment $\overrightarrow{PQ}$ for the points:

(a) $P(2,3,-7)$ and $Q(1,-6,-5)$ in $\mathbf{R}^3$;

(b) $P(1,-8,-4,6)$ and $Q(3,-5,2,-4)$ in $\mathbf{R}^4$.

1.52. Find an equation of the hyperplane $H$ in $\mathbf{R}^4$ that:

(a) contains $P(1,2,-3,2)$ and is normal to $u = [2,3,-5,6]$;

(b) contains $P(3,-1,2,5)$ and is parallel to $2x_1 - 3x_2 + 5x_3 - 7x_4 = 4$.

1.53. Find a parametric representation of the line in $\mathbf{R}^4$ that:

(a) passes through the points $P(1,2,1,2)$ and $Q(3,-5,7,-9)$;

(b) passes through $P(1,1,3,3)$ and is perpendicular to the hyperplane $2x_1 + 4x_2 + 6x_3 - 8x_4 = 5$.

Spatial Vectors (Vectors in $\mathbf{R}^3$), ijk Notation

1.54. Given $u = 3\mathbf{i} - 4\mathbf{j} + 2\mathbf{k}$, $v = 2\mathbf{i} + 5\mathbf{j} - 3\mathbf{k}$, $w = 4\mathbf{i} + 7\mathbf{j} + 2\mathbf{k}$. Find:

(a) $2u - 3v$; (b) $3u + 4v - 2w$; (c) $u \cdot v$, $u \cdot w$, $v \cdot w$; (d) $\|u\|$, $\|v\|$, $\|w\|$.

1.55. Find the equation of the plane $H$:

(a) with normal $N = 3\mathbf{i} - 4\mathbf{j} + 5\mathbf{k}$ and containing the point $P(1,2,-3)$;

(b) parallel to $4x + 3y - 2z = 11$ and containing the point $Q(2,-1,3)$.

1.56. Find the (parametric) equation of the line $L$:

(a) through the point $P(2,5,-3)$ and in the direction of $v = 4\mathbf{i} - 5\mathbf{j} + 7\mathbf{k}$;

(b) perpendicular to the plane $2x - 3y + 7z = 4$ and containing $P(1,-5,7)$.

1.57. Consider the following curve $C$ in $\mathbf{R}^3$ where $0 \le t \le 5$:

$F(t) = t^3\mathbf{i} - t^2\mathbf{j} + (2t - 3)\mathbf{k}$

(a) Find the point $P$ on $C$ corresponding to $t = 2$.

(b) Find the initial point $Q$ and the terminal point $Q'$.

(c) Find the unit tangent vector $T$ to the curve $C$ when $t = 2$.

1.58. Consider a moving body $B$ whose position at time $t$ is given by $R(t) = t^2\mathbf{i} + t^3\mathbf{j} + 2t\mathbf{k}$. [Then $V(t) = dR(t)/dt$
and $A(t) = dV(t)/dt$ denote, respectively, the velocity and acceleration of $B$.] When $t = 1$, find for the
body $B$:

(a) position; (b) velocity $v$; (c) speed $s$; (d) acceleration $a$.

1.59. Find a normal vector $N$ and the tangent plane $H$ to each surface at the given point:

(a) surface $x^2y + 3yz = 20$ and point $P(1,3,2)$;

(b) surface $x^2 + 3y^2 - 5z^2 = 160$ and point $P(3,-2,1)$.


Cross Product 

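1.60. Evaluate the following determinants and negative of determinants of order two:

(a) $\begin{vmatrix} 2 & 5 \\ 3 & 6 \end{vmatrix}$, $\begin{vmatrix} 3 & -6 \\ 1 & -4 \end{vmatrix}$, $\begin{vmatrix} -4 & -2 \\ 7 & -3 \end{vmatrix}$

(b) $-\begin{vmatrix} 6 & 4 \\ 7 & 5 \end{vmatrix}$, $-\begin{vmatrix} 1 & -3 \\ 2 & 4 \end{vmatrix}$, $-\begin{vmatrix} 8 & -3 \\ -6 & -2 \end{vmatrix}$

1.61. Given $u = 3\mathbf{i} - 4\mathbf{j} + 2\mathbf{k}$, $v = 2\mathbf{i} + 5\mathbf{j} - 3\mathbf{k}$, $w = 4\mathbf{i} + 7\mathbf{j} + 2\mathbf{k}$, find:

(a) $u \times v$, (b) $u \times w$, (c) $v \times w$.

1.62. Given $u = [2,1,3]$, $v = [4,-1,2]$, $w = [1,1,5]$, find:

(a) $u \times v$, (b) $u \times w$, (c) $v \times w$.

1.63. Find the volume $V$ of the parallelepiped formed by the vectors $u, v, w$ appearing in:
(a) Problem 1.61 (b) Problem 1.62.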


1.64. Find a unit vector $u$ orthogonal to:

(a) $v = [1,2,3]$ and $w = [1,-1,2]$;

(b) $v = 3\mathbf{i} - \mathbf{j} + 2\mathbf{k}$ and $w = 4\mathbf{i} - 2\mathbf{j} - \mathbf{k}$.

1.65. Prove the following properties of the cross product:

(a) $u \times v = -(v \times u)$
(b) $u \times u = 0$ for any vector $u$
(c) $(ku) \times v = k(u \times v) = u \times (kv)$
(d) $u \times (v + w) = (u \times v) + (u \times w)$
(e) $(v + w) \times u = (v \times u) + (w \times u)$
(f) $(u \times v) \times w = (u \cdot w)v - (v \cdot w)u$

Complex Numbers

1.66. Simplify:

(a) $(4 - 7i)(9 + 2i)$; (b) $(3 - 5i)^2$; (c) $\dfrac{1}{4 - 7i}$; (d) $\dfrac{9 + 2i}{3 - 5i}$; (e) $(1 - i)^3$.

1.67. Simplify: (a) $\dfrac{1}{2i}$; (b) $\dfrac{2 + 3i}{7 - 3i}$; (c) $i^{15}, i^{25}, i^{34}$; (d) $\left( \dfrac{1}{3 - i} \right)^2$.

1.68. Let $z = 2 - 5i$ and $w = 7 + 3i$. Find:

(a) $z + w$; (b) $zw$; (c) $z/w$; (d) $\bar{z}, \bar{w}$; (e) $|z|$, $|w|$.

1.69. Show that for complex numbers $z$ and $w$:

(a) $\operatorname{Re} z = \tfrac{1}{2}(z + \bar{z})$, (b) $\operatorname{Im} z = \tfrac{1}{2i}(z - \bar{z})$, (c) $zw = 0$ implies $z = 0$ or $w = 0$.

Vectors in $\mathbf{C}^n$

1.70. Let $u = (1 + 7i,\ 2 - 6i)$ and $v = (5 - 2i,\ 3 - 4i)$. Find:

(a) $u + v$ (b) $(3 + i)u$ (c) $2iu + (4 + 7i)v$ (d) $u \cdot v$ (e) $\|u\|$ and $\|v\|$

1.71. Prove: For any vectors $u, v, w$ in $\mathbf{C}^n$:

(a) $(u + v) \cdot w = u \cdot w + v \cdot w$, (b) $w \cdot (u + v) = w \cdot u + w \cdot v$.

1.72. Prove that the norm in $\mathbf{C}^n$ satisfies the following laws:

$[\mathrm{N}_1]$ For any vector $u$, $\|u\| \ge 0$; and $\|u\| = 0$ if and only if $u = 0$.

$[\mathrm{N}_2]$ For any vector $u$ and complex number $z$, $\|zu\| = |z|\|u\|$.

$[\mathrm{N}_3]$ For any vectors $u$ and $v$, $\|u + v\| \le \|u\| + \|v\|$.


ANSWERS TO SUPPLEMENTARY PROBLEMS 


1.41. (a) $(-3,-16,10)$; (b) $(6,1,35)$; (c) $-3, -12, 8$; (d) $\sqrt{21}, \sqrt{35}, \sqrt{14}$;

(e) $-3/(\sqrt{21}\sqrt{35})$; (f) $\sqrt{62}$; (g) $-\tfrac{3}{35}(3,5,1) = (-\tfrac{9}{35}, -\tfrac{15}{35}, -\tfrac{3}{35})$

1.42. (Column vectors) (a) $(-1,7,-22)$; (b) $(-1,26,-29)$; (c) $-15, -27, 34$;

(d) $\sqrt{26}, \sqrt{30}, 7$; (e) $-15/(\sqrt{26}\sqrt{30})$; (f) $\sqrt{86}$; (g) $-\tfrac{15}{30}v = (-1, -\tfrac{1}{2}, -\tfrac{5}{2})$

1.43. (a) $(-7,-14,13,45,0)$; (b) $(20,-29,22,16,-23)$; (c) $-6$; (d) $\sqrt{90}, \sqrt{95}$;

(e) $-\tfrac{6}{95}v$; (f) $\sqrt{197}$

1.44. (a) $(5/\sqrt{74},\ -7/\sqrt{74})$; (b) $(\tfrac{1}{5}, \tfrac{2}{5}, -\tfrac{2}{5}, \tfrac{4}{5})$; (c) $(6/\sqrt{133},\ -4/\sqrt{133},\ 9/\sqrt{133})$

1.45. (a) $3, 13, \sqrt{120}, 9$

1.46. (a) $x = 3$, $y = 5$; (b) $x = 0$, $y = 0$, and $x = -2$, $y = -4$

1.47. $x = -3$, $y = 3$, $z = \tfrac{3}{2}$

1.48. (a) $v = 5u_1 - u_2$; (b) $v = 16u_1 - 23u_2$

1.49. $v = 3u_1 - u_2 + 2u_3$

1.50. (a) $6$; (b) $3$; (c) $\tfrac{3}{2}$

1.51. (a) $v = [-1,-9,2]$; (b) $[2,3,6,-10]$

1.52. (a) $2x_1 + 3x_2 - 5x_3 + 6x_4 = 35$; (b) $2x_1 - 3x_2 + 5x_3 - 7x_4 = -16$

1.53. (a) $[2t + 1,\ -7t + 2,\ 6t + 1,\ -11t + 2]$; (b) $[2t + 1,\ 4t + 1,\ 6t + 3,\ -8t + 3]$

1.54. (a) $-23\mathbf{j} + 13\mathbf{k}$; (b) $9\mathbf{i} - 6\mathbf{j} - 10\mathbf{k}$; (c) $-20, -12, 37$; (d) $\sqrt{29}, \sqrt{38}, \sqrt{69}$

1.55. (a) $3x - 4y + 5z = -20$; (b) $4x + 3y - 2z = -1$

1.56. (a) $[4t + 2,\ -5t + 5,\ 7t - 3]$; (b) $[2t + 1,\ -3t - 5,\ 7t + 7]$

1.57. (a) $P = F(2) = 8\mathbf{i} - 4\mathbf{j} + \mathbf{k}$; (b) $Q = F(0) = -3\mathbf{k}$, $Q' = F(5) = 125\mathbf{i} - 25\mathbf{j} + 7\mathbf{k}$;

(c) $T = (6\mathbf{i} - 2\mathbf{j} + \mathbf{k})/\sqrt{41}$

1.58. (a) $\mathbf{i} + \mathbf{j} + 2\mathbf{k}$; (b) $2\mathbf{i} + 3\mathbf{j} + 2\mathbf{k}$; (c) $\sqrt{17}$; (d) $2\mathbf{i} + 6\mathbf{j}$

1.59. (a) $N = 6\mathbf{i} + 7\mathbf{j} + 9\mathbf{k}$, $6x + 7y + 9z = 45$; (b) $N = 6\mathbf{i} - 12\mathbf{j} - 10\mathbf{k}$, $3x - 6y - 5z = 16$

1.60. (a) $-3, -6, 26$; (b) $-2, -10, 34$

1.61. (a) $2\mathbf{i} + 13\mathbf{j} + 23\mathbf{k}$; (b) $-22\mathbf{i} + 2\mathbf{j} + 37\mathbf{k}$; (c) $31\mathbf{i} - 16\mathbf{j} - 6\mathbf{k}$

1.62. (a) $[5,8,-6]$; (b) $[2,-7,1]$; (c) $[-7,-18,5]$

1.63. (a) $145$; (b) $17$

1.64. (a) $(7,1,-3)/\sqrt{59}$; (b) $(5\mathbf{i} + 11\mathbf{j} - 2\mathbf{k})/\sqrt{150}$

1.66. (a) $50 - 55i$; (b) $-16 - 30i$; (c) $\tfrac{1}{65}(4 + 7i)$; (d) $\tfrac{1}{2}(1 + 3i)$; (e) $-2 - 2i$

1.67. (a) $-\tfrac{1}{2}i$; (b) $\tfrac{1}{58}(5 + 27i)$; (c) $-i, i, -1$; (d) $\tfrac{1}{50}(4 + 3i)$

1.68. (a) $9 - 2i$; (b) $29 - 29i$; (c) $\tfrac{1}{58}(-1 - 41i)$; (d) $2 + 5i$, $7 - 3i$; (e) $\sqrt{29}, \sqrt{58}$

1.69. (c) Hint: If $zw = 0$, then $|zw| = |z||w| = |0| = 0$

1.70. (a) $(6 + 5i,\ 5 - 10i)$; (b) $(-4 + 22i,\ 12 - 16i)$; (c) $(20 + 29i,\ 52 + 9i)$;

(d) $21 + 27i$; (e) $\sqrt{90}, \sqrt{54}$



CHAPTER 2 



Algebra of Matrices 


2.1 Introduction 


This chapter investigates matrices and algebraic operations defined on them. These matrices may be 
viewed as rectangular arrays of elements where each entry depends on two subscripts (as compared with 
vectors, where each entry depended on only one subscript). Systems of linear equations and their 
solutions (Chapter 3) may be efficiently investigated using the language of matrices. Furthermore, 
certain abstract objects introduced in later chapters, such as “change of basis,” “linear transformations,” 
and “quadratic forms,” can be represented by these matrices (rectangular arrays). On the other hand, the 
abstract treatment of linear algebra presented later on will give us new insight into the structure of these 
matrices. 

The entries in our matrices will come from some arbitrary, but fixed, field K. The elements of K are 
called numbers or scalars. Nothing essential is lost if the reader assumes that K is the real field R. 


2.2 Matrices 


A matrix $A$ over a field $K$ or, simply, a matrix $A$ (when $K$ is implicit) is a rectangular array of scalars
usually presented in the following form:

$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$

The rows of such a matrix $A$ are the $m$ horizontal lists of scalars:

$(a_{11}, a_{12}, \ldots, a_{1n}), \quad (a_{21}, a_{22}, \ldots, a_{2n}), \quad \ldots, \quad (a_{m1}, a_{m2}, \ldots, a_{mn})$

and the columns of $A$ are the $n$ vertical lists of scalars:

$\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}$

Note that the element $a_{ij}$, called the $ij$-entry or $ij$-element, appears in row $i$ and column $j$. We frequently
denote such a matrix by simply writing $A = [a_{ij}]$.

A matrix with $m$ rows and $n$ columns is called an $m$ by $n$ matrix, written $m \times n$. The pair of numbers $m$
and $n$ is called the size of the matrix. Two matrices $A$ and $B$ are equal, written $A = B$, if they have the same
size and if corresponding elements are equal. Thus, the equality of two $m \times n$ matrices is equivalent to a
system of $mn$ equalities, one for each corresponding pair of elements.

A matrix with only one row is called a row matrix or row vector, and a matrix with only one column is
called a column matrix or column vector. A matrix whose entries are all zero is called a zero matrix and
will usually be denoted by 0.

Matrices whose entries are all real numbers are called real matrices and are said to be matrices over $\mathbf{R}$.
Analogously, matrices whose entries are all complex numbers are called complex matrices and are said to
be matrices over $\mathbf{C}$. This text will be mainly concerned with such real and complex matrices.


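EXAMPLE 2.1

(a) The rectangular array $A = \begin{bmatrix} 1 & -4 & 5 \\ 0 & 3 & -2 \end{bmatrix}$ is a $2 \times 3$ matrix. Its rows are $(1,-4,5)$ and $(0,3,-2)$,
and its columns are

$\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -4 \\ 3 \end{bmatrix}, \quad \begin{bmatrix} 5 \\ -2 \end{bmatrix}$

(b) The $2 \times 4$ zero matrix is the matrix $0 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$.

(c) Find $x, y, z, t$ such that

$\begin{bmatrix} x + y & 2z + t \\ x - y & z - t \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 1 & 5 \end{bmatrix}$

By definition of equality of matrices, the four corresponding entries must be equal. Thus,

$x + y = 3, \quad x - y = 1, \quad 2z + t = 7, \quad z - t = 5$

Solving the above system of equations yields $x = 2$, $y = 1$, $z = 4$, $t = -1$.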


2.3 Matrix Addition and Scalar Multiplication 

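Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two matrices with the same size, say $m \times n$ matrices. The sum of $A$ and $B$,
written $A + B$, is the matrix obtained by adding corresponding elements from $A$ and $B$. That is,

$A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{bmatrix}$

The product of the matrix $A$ by a scalar $k$, written $k \cdot A$ or simply $kA$, is the matrix obtained by multiplying
each element of $A$ by $k$. That is,

$kA = \begin{bmatrix} ka_{11} & ka_{12} & \cdots & ka_{1n} \\ ka_{21} & ka_{22} & \cdots & ka_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ ka_{m1} & ka_{m2} & \cdots & ka_{mn} \end{bmatrix}$

Observe that $A + B$ and $kA$ are also $m \times n$ matrices. We also define

$-A = (-1)A$ and $A - B = A + (-B)$

The matrix $-A$ is called the negative of the matrix $A$, and the matrix $A - B$ is called the difference of $A$ and
$B$. The sum of matrices with different sizes is not defined.

EXAMPLE 2.2 Let $A = \begin{bmatrix} 1 & -2 & 3 \\ 0 & 4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 4 & 6 & 8 \\ 1 & -3 & -7 \end{bmatrix}$. Then

$A + B = \begin{bmatrix} 1 + 4 & -2 + 6 & 3 + 8 \\ 0 + 1 & 4 + (-3) & 5 + (-7) \end{bmatrix} = \begin{bmatrix} 5 & 4 & 11 \\ 1 & 1 & -2 \end{bmatrix}$

$3A = \begin{bmatrix} 3(1) & 3(-2) & 3(3) \\ 3(0) & 3(4) & 3(5) \end{bmatrix} = \begin{bmatrix} 3 & -6 & 9 \\ 0 & 12 & 15 \end{bmatrix}$

$2A - 3B = \begin{bmatrix} 2 & -4 & 6 \\ 0 & 8 & 10 \end{bmatrix} + \begin{bmatrix} -12 & -18 & -24 \\ -3 & 9 & 21 \end{bmatrix} = \begin{bmatrix} -10 & -22 & -18 \\ -3 & 17 & 31 \end{bmatrix}$

The matrix $2A - 3B$ is called a linear combination of $A$ and $B$.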
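
The entrywise definitions of this section can be sketched in Python with matrices stored as lists of rows. That representation, and the names `mat_add` and `scalar_mul`, are assumptions of this illustration. The data are those of Example 2.2.

```python
def mat_add(A, B):
    """Add two matrices of the same size entry by entry."""
    if len(A) != len(B) or any(len(r) != len(s) for r, s in zip(A, B)):
        raise ValueError("the sum of matrices with different sizes is not defined")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]


def scalar_mul(k, A):
    """Multiply every entry of A by the scalar k."""
    return [[k * a for a in row] for row in A]


A = [[1, -2, 3], [0, 4, 5]]
B = [[4, 6, 8], [1, -3, -7]]

print(mat_add(A, B))     # [[5, 4, 11], [1, 1, -2]]
print(scalar_mul(3, A))  # [[3, -6, 9], [0, 12, 15]]

# The linear combination 2A - 3B, computed as 2A + (-3)B.
combo = mat_add(scalar_mul(2, A), scalar_mul(-3, B))
print(combo)             # [[-10, -22, -18], [-3, 17, 31]]
```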


Basic properties of matrices under the operations of matrix addition and scalar multiplication follow.

THEOREM 2.1: Consider any matrices $A$, $B$, $C$ (with the same size) and any scalars $k$ and $k'$. Then

(i) $(A + B) + C = A + (B + C)$,      (v) $k(A + B) = kA + kB$,
(ii) $A + 0 = 0 + A = A$,             (vi) $(k + k')A = kA + k'A$,
(iii) $A + (-A) = (-A) + A = 0$,      (vii) $(kk')A = k(k'A)$,
(iv) $A + B = B + A$,                 (viii) $1 \cdot A = A$.

Note first that the 0 in (ii) and (iii) refers to the zero matrix. Also, by (i) and (iv), any sum of matrices

$A_1 + A_2 + \cdots + A_n$

requires no parentheses, and the sum does not depend on the order of the matrices. Furthermore, using (vi)
and (viii), we also have

$A + A = 2A, \quad A + A + A = 3A, \quad \ldots$

and so on.

The proof of Theorem 2.1 reduces to showing that the $ij$-entries on both sides of each matrix equation
are equal. (See Problem 2.3.)

Observe the similarity between Theorem 2.1 for matrices and Theorem 1.1 for vectors. In fact, the
above operations for matrices may be viewed as generalizations of the corresponding operations for
vectors.


2.4 Summation Symbol 

Before we define matrix multiplication, it will be instructive to first introduce the summation symbol X (the 
Greek capital letter sigma). 

Suppose f(k) is an algebraic expression involving the letter k. Then the expression 

E/(*) or equivalently £Li f(k) 

k= 1 

has the following meaning. First we set k — 1 in f(k), obtaining 

/(l) 

Then we set k — 2 in f(k), obtaining/(2), and add this to/(l), obtaining 

/(l) +/(2) 
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Then we set k = 3 in f(k), obtaining /(3), and add this to the previous sum, obtaining 

/(l) +/(2) +/(3) 

We continue this process until we obtain the sum 

/(l) +/(2) + • • • +/(«) 

Observe that at each step we increase the value of k by 1 until we reach n. The letter k is called the index, 
and 1 and n are called, respectively, the lower and upper limits. Other letters frequently used as indices are 
i and j. 

We also generalize our definition by allowing the sum to range from any integer $n_1$ to any integer $n_2$.
That is, we define

\[ \sum_{k=n_1}^{n_2} f(k) = f(n_1) + f(n_1 + 1) + f(n_1 + 2) + \cdots + f(n_2) \]


EXAMPLE 2.3

(a) $\sum_{k=1}^{5} x_k = x_1 + x_2 + x_3 + x_4 + x_5$ and $\sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$

(b) $\sum_{j=2}^{5} j^2 = 2^2 + 3^2 + 4^2 + 5^2 = 54$ and $\sum_{i=0}^{n} a_i x^i = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$

(c) $\sum_{k=1}^{p} a_{ik} b_{kj} = a_{i1} b_{1j} + a_{i2} b_{2j} + a_{i3} b_{3j} + \cdots + a_{ip} b_{pj}$
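The summation symbol corresponds directly to a loop over the index. The following is a minimal Python sketch, not part of the original text; the helper name `summation` and the sample values chosen for the a_i and b_i are assumptions made only for illustration.

```python
def summation(f, lower, upper):
    """Return f(lower) + f(lower + 1) + ... + f(upper); both limits are included."""
    return sum(f(k) for k in range(lower, upper + 1))

# Example 2.3(b): the sum of j^2 for j = 2, ..., 5.
squares = summation(lambda j: j ** 2, 2, 5)

# Second sum of Example 2.3(a), with illustrative (assumed) values for a_i and b_i.
a = [1, 2, 3]
b = [4, 5, 6]
# Indices in the text start at 1, while Python lists start at 0.
dot = summation(lambda i: a[i - 1] * b[i - 1], 1, len(a))
```

Here `squares` reproduces the value 54 computed in part (b), and `dot` is the sum a_1 b_1 + a_2 b_2 + a_3 b_3 for the sample values.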


2.5 Matrix Multiplication 


The product of matrices A and B, written AB, is somewhat complicated. For this reason, we first begin with 
a special case. 

The product AB of a row matrix $A = [a_i]$ and a column matrix $B = [b_i]$ with the same number of
elements is defined to be the scalar (or 1 x 1 matrix) obtained by multiplying corresponding entries and
adding; that is,

\[ AB = [a_1, a_2, \ldots, a_n] \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{k=1}^{n} a_k b_k \]


We emphasize that AB is a scalar (or a 1 x 1 matrix). The product AB is not defined when A and B have 
different numbers of elements. 


EXAMPLE 2.4

(a) $[7, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} = 7(3) + (-4)(2) + 5(-1) = 21 - 8 - 5 = 8$

(b) $[6, -1, 8, 3] \begin{bmatrix} 4 \\ -9 \\ -2 \\ 5 \end{bmatrix} = 24 + 9 - 16 + 15 = 32$


We are now ready to define matrix multiplication in general. 
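DEFINITION: Suppose $A = [a_{ik}]$ and $B = [b_{kj}]$ are matrices such that the number of columns of A is
equal to the number of rows of B; say, A is an m x p matrix and B is a p x n matrix. Then
the product AB is the m x n matrix whose ij-entry is obtained by multiplying the ith row
of A by the jth column of B. That is,

\[
\begin{bmatrix} a_{11} & \cdots & a_{1p} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{ip} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mp} \end{bmatrix}
\begin{bmatrix} b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\ \vdots & & \vdots & & \vdots \\ b_{p1} & \cdots & b_{pj} & \cdots & b_{pn} \end{bmatrix}
=
\begin{bmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & c_{ij} & \vdots \\ c_{m1} & \cdots & c_{mn} \end{bmatrix}
\]

where $c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{ip} b_{pj} = \sum_{k=1}^{p} a_{ik} b_{kj}$.

The product AB is not defined if A is an m x p matrix and B is a q x n matrix, where p ≠ q.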

EXAMPLE 2.5 

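(a) Find AB where $A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & -4 \\ 5 & -2 & 6 \end{bmatrix}$.

Because A is 2 x 2 and B is 2 x 3, the product AB is defined and AB is a 2 x 3 matrix. To obtain
the first row of the product matrix AB, multiply the first row [1, 3] of A by each column of B,

\[ \begin{bmatrix} 2 \\ 5 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ -2 \end{bmatrix}, \quad \begin{bmatrix} -4 \\ 6 \end{bmatrix} \]

respectively. That is,

\[ AB = \begin{bmatrix} 2 + 15 & 0 - 6 & -4 + 18 \\ \ & \ & \ \end{bmatrix} = \begin{bmatrix} 17 & -6 & 14 \\ \ & \ & \ \end{bmatrix} \]

To obtain the second row of AB, multiply the second row [2, -1] of A by each column of B. Thus,

\[ AB = \begin{bmatrix} 17 & -6 & 14 \\ 4 - 5 & 0 + 2 & -8 - 6 \end{bmatrix} = \begin{bmatrix} 17 & -6 & 14 \\ -1 & 2 & -14 \end{bmatrix} \]

(b) Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 5 & 6 \\ 0 & -2 \end{bmatrix}$. Then

\[ AB = \begin{bmatrix} 5 + 0 & 6 - 4 \\ 15 + 0 & 18 - 8 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ 15 & 10 \end{bmatrix} \quad \text{and} \quad BA = \begin{bmatrix} 5 + 18 & 10 + 24 \\ 0 - 6 & 0 - 8 \end{bmatrix} = \begin{bmatrix} 23 & 34 \\ -6 & -8 \end{bmatrix} \]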
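The row-by-column rule in the definition can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the original text; matrices are assumed to be stored as lists of rows, and the function name `matmul` and the variable names `C`, `D` for the matrices of part (b) are hypothetical.

```python
def matmul(A, B):
    """Multiply an m x p matrix A by a p x n matrix B (both given as lists of rows)."""
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    n = len(B[0])
    # The ij-entry is the sum over k of a_ik * b_kj.
    return [[sum(row[k] * B[k][j] for k in range(p)) for j in range(n)] for row in A]

# Matrices of Example 2.5(a).
A = [[1, 3], [2, -1]]
B = [[2, 0, -4], [5, -2, 6]]
AB = matmul(A, B)

# Matrices of Example 2.5(b), renamed C and D here to avoid clashing with part (a).
C = [[1, 2], [3, 4]]
D = [[5, 6], [0, -2]]
CD = matmul(C, D)
DC = matmul(D, C)
```

With these inputs, `CD` and `DC` differ, which is the noncommutativity observed in part (b); calling `matmul(B, A)` for the matrices of part (a) raises `ValueError`, because B has three columns while A has two rows.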


The above example shows that matrix multiplication is not commutative; that is, in general, AB ≠ BA.
However, matrix multiplication does satisfy the following properties. 

THEOREM 2.2: Let A,B, C be matrices. Then, whenever the products and sums are defined, 

(i) (AB)C = A(BC) (associative law),

(ii) A(B + C) = AB + AC (left distributive law),

(iii) (B + C)A = BA + CA (right distributive law),

(iv) k(AB) = (kA)B = A(kB), where k is a scalar.

We note that 0A = 0 and B0 = 0, where 0 is the zero matrix.
2.6 Transpose of a Matrix

The transpose of a matrix A, written $A^T$, is the matrix obtained by writing the columns of A, in order, as
rows. For example,

\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \quad \text{and} \quad [1, -3, -5]^T = \begin{bmatrix} 1 \\ -3 \\ -5 \end{bmatrix} \]

In other words, if $A = [a_{ij}]$ is an m x n matrix, then $A^T = [b_{ij}]$ is the n x m matrix where $b_{ij} = a_{ji}$.

Observe that the transpose of a row vector is a column vector. Similarly, the transpose of a column vector
is a row vector.

The next theorem lists basic properties of the transpose operation. 

THEOREM 2.3: Let A and B be matrices and let k be a scalar. Then, whenever the sum and product are 
defined, 

(i) $(A + B)^T = A^T + B^T$,        (iii) $(kA)^T = kA^T$,

(ii) $(A^T)^T = A$,                 (iv) $(AB)^T = B^T A^T$.

We emphasize that, by (iv), the transpose of a product is the product of the transposes, but in the reverse 
order. 
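Property (iv) can be checked numerically on a small case. The following Python sketch is illustrative only: the helper names `transpose` and `matmul` are assumptions, A is the 2 x 3 matrix used in the transpose example of this section, and B is an arbitrary 3 x 2 matrix chosen here for the check.

```python
def transpose(A):
    """Write the columns of A, in order, as rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Row-by-column product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]      # 2 x 3 matrix from the transpose example
B = [[2, 0], [1, -1], [0, 3]]   # illustrative 3 x 2 matrix (an assumption)

lhs = transpose(matmul(A, B))             # (AB)^T
rhs = matmul(transpose(B), transpose(A))  # B^T A^T, transposes in reverse order
```

Note that the product of the transposes in the original order, A^T B^T, is a 3 x 3 matrix here, so it cannot equal the 2 x 2 matrix (AB)^T.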


2.7 Square Matrices 


A square matrix is a matrix with the same number of rows as columns. An n x n square matrix is said to 
be of order n and is sometimes called an n-square matrix. 

Recall that not every two matrices can be added or multiplied. However, if we only consider square 
matrices of some given order n, then this inconvenience disappears. Specifically, the operations of 
addition, multiplication, scalar multiplication, and transpose can be performed on any n x n matrices, and 
the result is again an n x n matrix. 


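EXAMPLE 2.6 The following are square matrices of order 3:

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ -4 & -4 & -4 \\ 5 & 6 & 7 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 2 & -5 & 1 \\ 0 & 3 & -2 \\ 1 & 2 & -4 \end{bmatrix} \]

The following are also matrices of order 3:

\[ A + B = \begin{bmatrix} 3 & -3 & 4 \\ -4 & -1 & -6 \\ 6 & 8 & 3 \end{bmatrix}, \quad 2A = \begin{bmatrix} 2 & 4 & 6 \\ -8 & -8 & -8 \\ 10 & 12 & 14 \end{bmatrix}, \quad A^T = \begin{bmatrix} 1 & -4 & 5 \\ 2 & -4 & 6 \\ 3 & -4 & 7 \end{bmatrix} \]

\[ AB = \begin{bmatrix} 5 & 7 & -15 \\ -12 & 0 & 20 \\ 17 & 7 & -35 \end{bmatrix}, \quad BA = \begin{bmatrix} 27 & 30 & 33 \\ -22 & -24 & -26 \\ -27 & -30 & -33 \end{bmatrix} \]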


Diagonal and Trace 

Let $A = [a_{ij}]$ be an n-square matrix. The diagonal or main diagonal of A consists of the elements with the
same subscripts; that is,

\[ a_{11}, a_{22}, a_{33}, \ldots, a_{nn} \]

The trace of A, written tr(A), is the sum of the diagonal elements. Namely,

\[ \mathrm{tr}(A) = a_{11} + a_{22} + a_{33} + \cdots + a_{nn} \]

The following theorem applies.

THEOREM 2.4: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are n-square matrices and k is a scalar. Then

(i) tr(A + B) = tr(A) + tr(B),        (iii) tr($A^T$) = tr(A),

(ii) tr(kA) = k tr(A),                (iv) tr(AB) = tr(BA).

EXAMPLE 2.7 Let A and B be the matrices A and B in Example 2.6. Then

diagonal of A = {1, -4, 7} and tr(A) = 1 - 4 + 7 = 4

diagonal of B = {2, 3, -4} and tr(B) = 2 + 3 - 4 = 1

Moreover,

tr(A + B) = 3 - 1 + 3 = 5, tr(2A) = 2 - 8 + 14 = 8, tr($A^T$) = 1 - 4 + 7 = 4

tr(AB) = 5 + 0 - 35 = -30, tr(BA) = 27 - 24 - 33 = -30

As expected from Theorem 2.4,

tr(A + B) = tr(A) + tr(B), tr($A^T$) = tr(A), tr(2A) = 2 tr(A)

Furthermore, although AB ≠ BA, the traces are equal.
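The trace computations of Example 2.7 can be reproduced with a short Python sketch. It is offered as an illustration, not as part of the original text; the helper names `trace` and `matmul` are assumptions, and A and B are the matrices of Example 2.6 stored as lists of rows.

```python
def matmul(A, B):
    """Row-by-column product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(M):
    """Sum of the diagonal elements of a square matrix M."""
    return sum(M[i][i] for i in range(len(M)))

# Matrices A and B of Example 2.6.
A = [[1, 2, 3], [-4, -4, -4], [5, 6, 7]]
B = [[2, -5, 1], [0, 3, -2], [1, 2, -4]]

AB = matmul(A, B)
BA = matmul(B, A)
traces = (trace(A), trace(B), trace(AB), trace(BA))
```

The two products `AB` and `BA` are different matrices, yet their traces agree, as Theorem 2.4(iv) predicts.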


Identity Matrix, Scalar Matrices 

The n-square identity or unit matrix, denoted by $I_n$, or simply I, is the n-square matrix with 1's on
the diagonal and 0's elsewhere. The identity matrix I is similar to the scalar 1 in that, for any n-square
matrix A,

\[ AI = IA = A \]

More generally, if B is an m x n matrix, then $BI_n = I_m B = B$.

For any scalar k, the matrix kI that contains k's on the diagonal and 0's elsewhere is called the scalar
matrix corresponding to the scalar k. Observe that

\[ (kI)A = k(IA) = kA \]

That is, multiplying a matrix A by the scalar matrix kI is equivalent to multiplying A by the scalar k.

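EXAMPLE 2.8 The following are the identity matrices of orders 3 and 4 and the corresponding scalar
matrices for k = 5:

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{bmatrix}, \quad
\begin{bmatrix} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{bmatrix}, \quad
\begin{bmatrix} 5 & & & \\ & 5 & & \\ & & 5 & \\ & & & 5 \end{bmatrix}
\]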
Remark 1: It is common practice to omit blocks or patterns of 0’s when there is no ambiguity, as in 
the above second and fourth matrices. 


Remark 2: The Kronecker delta function $\delta_{ij}$ is defined by

\[ \delta_{ij} = \begin{cases} 0 & \text{if } i \ne j \\ 1 & \text{if } i = j \end{cases} \]

Thus, the identity matrix may be defined by $I = [\delta_{ij}]$.
2.8 Powers of Matrices, Polynomials in Matrices


Let A be an n-square matrix over a field K. Powers of A are defined as follows:

\[ A^2 = AA, \quad A^3 = A^2 A, \quad \ldots, \quad A^{n+1} = A^n A, \quad \ldots, \quad \text{and} \quad A^0 = I \]

Polynomials in the matrix A are also defined. Specifically, for any polynomial

\[ f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \]

where the $a_i$ are scalars in K, f(A) is defined to be the following matrix:

\[ f(A) = a_0 I + a_1 A + a_2 A^2 + \cdots + a_n A^n \]

[Note that f(A) is obtained from f(x) by substituting the matrix A for the variable x and substituting the
scalar matrix $a_0 I$ for the scalar $a_0$.] If f(A) is the zero matrix, then A is called a zero or root of f(x).

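EXAMPLE 2.9 Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}$. Then

\[ A^2 = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} \quad \text{and} \quad A^3 = A^2 A = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} -11 & 38 \\ 57 & -106 \end{bmatrix} \]

Suppose $f(x) = 2x^2 - 3x + 5$ and $g(x) = x^2 + 3x - 10$. Then

\[ f(A) = 2 \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} - 3 \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} + 5 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 16 & -18 \\ -27 & 61 \end{bmatrix} \]

\[ g(A) = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} + 3 \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} - 10 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]

Thus, A is a zero of the polynomial g(x).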
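Evaluating a polynomial at a matrix can be sketched in Python as follows. The sketch is an illustration under stated assumptions: the function name `poly_at_matrix` and the coefficient ordering `[a_0, a_1, ..., a_n]` are choices made here, and the evaluation uses Horner's scheme (repeatedly multiply by A, then add the next coefficient times I) instead of forming each power of A separately as the text does. Both approaches give the same matrix.

```python
def matmul(A, B):
    """Row-by-column product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def poly_at_matrix(coeffs, A):
    """Return a_0 I + a_1 A + ... + a_n A^n for coeffs = [a_0, a_1, ..., a_n]."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, A)   # multiply the running value by A
        for i in range(n):
            result[i][i] += c        # add the scalar matrix c * I
    return result

A = [[1, 2], [3, -4]]                      # matrix of Example 2.9
f_A = poly_at_matrix([5, -3, 2], A)        # f(x) = 2x^2 - 3x + 5
g_A = poly_at_matrix([-10, 3, 1], A)       # g(x) = x^2 + 3x - 10
```

Here `g_A` is the zero matrix, which is the statement that A is a zero of g(x).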


2.9 Invertible (Nonsingular) Matrices 


A square matrix A is said to be invertible or nonsingular if there exists a matrix B such that 
AB = BA = I 


where I is the identity matrix. Such a matrix B is unique. That is, if $AB_1 = B_1 A = I$ and $AB_2 = B_2 A = I$,
then

\[ B_1 = B_1 I = B_1 (AB_2) = (B_1 A) B_2 = I B_2 = B_2 \]

We call such a matrix B the inverse of A and denote it by $A^{-1}$. Observe that the above relation is
symmetric: that is, if B is the inverse of A, then A is the inverse of B.


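EXAMPLE 2.10 Suppose that $A = \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix}$. Then

\[ AB = \begin{bmatrix} 6 - 5 & -10 + 10 \\ 3 - 3 & -5 + 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad BA = \begin{bmatrix} 6 - 5 & 15 - 15 \\ -2 + 2 & -5 + 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]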


Thus, A and B are inverses. 

It is known (Theorem 3.18) that AB = I if and only if BA = I. Thus, it is necessary to test only one
product to determine whether or not two given matrices are inverses. (See Problem 2.17.)

Now suppose A and B are invertible. Then AB is invertible and $(AB)^{-1} = B^{-1} A^{-1}$. More generally, if
$A_1, A_2, \ldots, A_k$ are invertible, then their product is invertible and

\[ (A_1 A_2 \cdots A_k)^{-1} = A_k^{-1} \cdots A_2^{-1} A_1^{-1} \]

the product of the inverses in the reverse order.
Inverse of a 2 x 2 Matrix

Let A be an arbitrary 2 x 2 matrix, say $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. We want to derive a formula for $A^{-1}$, the inverse
of A. Specifically, we seek $2^2 = 4$ scalars, say $x_1, y_1, x_2, y_2$, such that

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x_1 & x_2 \\ y_1 & y_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} ax_1 + by_1 & ax_2 + by_2 \\ cx_1 + dy_1 & cx_2 + dy_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

Setting the four entries equal to the corresponding entries in the identity matrix yields four equations,
which can be partitioned into two 2 x 2 systems as follows:

\[ ax_1 + by_1 = 1, \qquad ax_2 + by_2 = 0 \]
\[ cx_1 + dy_1 = 0, \qquad cx_2 + dy_2 = 1 \]

Suppose we let |A| = ad - bc (called the determinant of A). Assuming |A| ≠ 0, we can solve uniquely for
the above unknowns $x_1, y_1, x_2, y_2$, obtaining

\[ x_1 = \frac{d}{|A|}, \quad y_1 = \frac{-c}{|A|}, \quad x_2 = \frac{-b}{|A|}, \quad y_2 = \frac{a}{|A|} \]

Accordingly,

\[ A^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \begin{bmatrix} d/|A| & -b/|A| \\ -c/|A| & a/|A| \end{bmatrix} = \frac{1}{|A|} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]

In other words, when |A| ≠ 0, the inverse of a 2 x 2 matrix A may be obtained from A as follows:

(1) Interchange the two elements on the diagonal. 

(2) Take the negatives of the other two elements. 

(3) Multiply the resulting matrix by 1/|A| or, equivalently, divide each element by |A|. 

In case |A| =0, the matrix A is not invertible. 


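EXAMPLE 2.11 Find the inverse of $A = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$.

First evaluate |A| = 2(5) - 3(4) = 10 - 12 = -2. Because |A| ≠ 0, the matrix A is invertible and

\[ A^{-1} = \frac{1}{-2} \begin{bmatrix} 5 & -3 \\ -4 & 2 \end{bmatrix} = \begin{bmatrix} -5/2 & 3/2 \\ 2 & -1 \end{bmatrix} \]

Now evaluate |B| = 1(6) - 3(2) = 6 - 6 = 0. Because |B| = 0, the matrix B has no inverse.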


Remark: The above property that a matrix A is invertible if and only if A has a nonzero determinant is
true for square matrices of any order. (See Chapter 8.)
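The three-step recipe for a 2 x 2 inverse can be sketched in Python. This is an illustrative sketch, not part of the original text: the function name `inverse_2x2` and the convention of returning `None` for a singular matrix are assumptions, and exact fractions from the standard library are used so that entries such as -5/2 are not rounded.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Return the inverse of a 2 x 2 matrix [[a, b], [c, d]], or None if |M| = 0."""
    (a, b), (c, d) = M
    det = a * d - b * c          # the determinant |M| = ad - bc
    if det == 0:
        return None              # not invertible
    # Interchange the diagonal elements, negate the other two, divide by |M|.
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A_inv = inverse_2x2([[2, 3], [4, 5]])   # matrix A of Example 2.11
B_inv = inverse_2x2([[1, 3], [2, 6]])   # matrix B of Example 2.11, determinant 0
```

For the matrices of Example 2.11, `A_inv` holds the entries -5/2, 3/2, 2, -1 and `B_inv` is `None`.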


Inverse of an n x n Matrix 

Suppose A is an arbitrary n-square matrix. Finding its inverse $A^{-1}$ reduces, as above, to finding the
solution of a collection of n x n systems of linear equations. The solution of such systems and an efficient
way of solving such a collection of systems are treated in Chapter 3.


2.10 Special Types of Square Matrices 

This section describes a number of special kinds of square matrices. 

Diagonal and Triangular Matrices 

A square matrix $D = [d_{ij}]$ is diagonal if its nondiagonal entries are all zero. Such a matrix is sometimes
denoted by

\[ D = \mathrm{diag}(d_{11}, d_{22}, \ldots, d_{nn}) \]

where some or all the $d_{ii}$ may be zero. For example,

\[
\begin{bmatrix} 3 & 0 & 0 \\ 0 & -7 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 4 & 0 \\ 0 & -5 \end{bmatrix}, \quad
\begin{bmatrix} 6 & & & \\ & 0 & & \\ & & -9 & \\ & & & 8 \end{bmatrix}
\]

are diagonal matrices, which may be represented, respectively, by

diag(3, -7, 2), diag(4, -5), diag(6, 0, -9, 8)


(Observe that patterns of 0’s in the third matrix have been omitted.) 

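A square matrix $A = [a_{ij}]$ is upper triangular or simply triangular if all entries below the (main)
diagonal are equal to 0; that is, if $a_{ij} = 0$ for i > j. Generic upper triangular matrices of orders 2, 3, 4 are
as follows:

\[
\begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix}, \quad
\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ & b_{22} & b_{23} \\ & & b_{33} \end{bmatrix}, \quad
\begin{bmatrix} c_{11} & c_{12} & c_{13} & c_{14} \\ & c_{22} & c_{23} & c_{24} \\ & & c_{33} & c_{34} \\ & & & c_{44} \end{bmatrix}
\]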


(As with diagonal matrices, it is common practice to omit patterns of 0’s.) 
The following theorem applies. 


THEOREM 2.5: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are n x n (upper) triangular matrices. Then

(i) A + B, kA, AB are triangular with respective diagonals:

\[ (a_{11} + b_{11}, \ldots, a_{nn} + b_{nn}), \quad (ka_{11}, \ldots, ka_{nn}), \quad (a_{11} b_{11}, \ldots, a_{nn} b_{nn}) \]

(ii) For any polynomial f(x), the matrix f(A) is triangular with diagonal

\[ (f(a_{11}), f(a_{22}), \ldots, f(a_{nn})) \]

(iii) A is invertible if and only if each diagonal element $a_{ii} \ne 0$, and when $A^{-1}$ exists it
is also triangular.

A lower triangular matrix is a square matrix whose entries above the diagonal are all zero. We note that 
Theorem 2.5 is true if we replace “triangular” by either “lower triangular” or “diagonal.” 


Remark: A nonempty collection A of matrices is called an algebra (of matrices) if A is closed under 
the operations of matrix addition, scalar multiplication, and matrix multiplication. Clearly, the square 
matrices with a given order form an algebra of matrices, but so do the scalar, diagonal, triangular, and 
lower triangular matrices. 


Special Real Square Matrices: Symmetric, Orthogonal, Normal 
[Optional until Chapter 12] 

Suppose now A is a square matrix with real entries—that is, a real square matrix. The relationship between 
A and its transpose $A^T$ yields important kinds of matrices.


(a) Symmetric Matrices 

A matrix A is symmetric if $A^T = A$. Equivalently, $A = [a_{ij}]$ is symmetric if symmetric elements (mirror
elements with respect to the diagonal) are equal; that is, if each $a_{ij} = a_{ji}$.

A matrix A is skew-symmetric if $A^T = -A$ or, equivalently, if each $a_{ij} = -a_{ji}$. Clearly, the diagonal
elements of such a matrix must be zero, because $a_{ii} = -a_{ii}$ implies $a_{ii} = 0$.

(Note that a matrix A must be square if $A^T = A$ or $A^T = -A$.)
EXAMPLE 2.12 Let $A = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix}$, $B = \begin{bmatrix} 0 & 3 & -4 \\ -3 & 0 & 5 \\ 4 & -5 & 0 \end{bmatrix}$, $C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

(a) By inspection, the symmetric elements in A are equal, or $A^T = A$. Thus, A is symmetric.

(b) The diagonal elements of B are 0 and symmetric elements are negatives of each other, or B T = —B. 
Thus, B is skew-symmetric. 

(c) Because C is not square, C is neither symmetric nor skew-symmetric. 
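The definitions of symmetric and skew-symmetric matrices translate into direct comparisons with the transpose. The Python sketch below is illustrative only; the helper names are assumptions, and the matrices are those of Example 2.12 as read from this text.

```python
def transpose(A):
    """Write the columns of A, in order, as rows."""
    return [list(col) for col in zip(*A)]

def is_symmetric(A):
    """True when A^T = A."""
    return transpose(A) == A

def is_skew_symmetric(A):
    """True when A^T = -A."""
    return transpose(A) == [[-x for x in row] for row in A]

A = [[2, -3, 5], [-3, 6, 7], [5, 7, -8]]
B = [[0, 3, -4], [-3, 0, 5], [4, -5, 0]]
C = [[1, 0, 0], [0, 0, 1]]

results = [(is_symmetric(M), is_skew_symmetric(M)) for M in (A, B, C)]
```

For the non-square matrix C, the transpose has a different shape from C, so both comparisons are false without any special-case code.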


(b) Orthogonal Matrices 

A real matrix A is orthogonal if $A^T = A^{-1}$; that is, if $AA^T = A^T A = I$. Thus, A must necessarily be square
and invertible. 


EXAMPLE 2.13 Let $A = \begin{bmatrix} 1/9 & 8/9 & -4/9 \\ 4/9 & -4/9 & -7/9 \\ 8/9 & 1/9 & 4/9 \end{bmatrix}$. Multiplying A by $A^T$ yields I; that is, $AA^T = I$. This means
$A^T A = I$, as well. Thus, $A^T = A^{-1}$; that is, A is orthogonal.


Now suppose A is a real orthogonal 3x3 matrix with rows 

\[ u_1 = (a_1, a_2, a_3), \quad u_2 = (b_1, b_2, b_3), \quad u_3 = (c_1, c_2, c_3) \]

Because A is orthogonal, we must have $AA^T = I$. Namely,

\[ AA^T = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I \]

Multiplying A by $A^T$ and setting each entry equal to the corresponding entry in I yields the following nine
equations:

\[ a_1^2 + a_2^2 + a_3^2 = 1, \quad a_1 b_1 + a_2 b_2 + a_3 b_3 = 0, \quad a_1 c_1 + a_2 c_2 + a_3 c_3 = 0 \]
\[ b_1 a_1 + b_2 a_2 + b_3 a_3 = 0, \quad b_1^2 + b_2^2 + b_3^2 = 1, \quad b_1 c_1 + b_2 c_2 + b_3 c_3 = 0 \]
\[ c_1 a_1 + c_2 a_2 + c_3 a_3 = 0, \quad c_1 b_1 + c_2 b_2 + c_3 b_3 = 0, \quad c_1^2 + c_2^2 + c_3^2 = 1 \]

Accordingly, $u_1 \cdot u_1 = 1$, $u_2 \cdot u_2 = 1$, $u_3 \cdot u_3 = 1$, and $u_i \cdot u_j = 0$ for i ≠ j. Thus, the rows $u_1, u_2, u_3$ are
unit vectors and are orthogonal to each other.


Generally speaking, vectors $u_1, u_2, \ldots, u_m$ in $\mathbf{R}^n$ are said to form an orthonormal set of vectors if the
vectors are unit vectors and are orthogonal to each other; that is,

\[ u_i \cdot u_j = \begin{cases} 0 & \text{if } i \ne j \\ 1 & \text{if } i = j \end{cases} \]

In other words, $u_i \cdot u_j = \delta_{ij}$ where $\delta_{ij}$ is the Kronecker delta function.

We have shown that the condition $AA^T = I$ implies that the rows of A form an orthonormal set of
vectors. The condition $A^T A = I$ similarly implies that the columns of A also form an orthonormal set
of vectors. Furthermore, because each step is reversible, the converse is true.

The above results for 3 x 3 matrices are true in general. That is, the following theorem holds. 


THEOREM 2.6: Let A be a real matrix. Then the following are equivalent: 

(a) A is orthogonal. 

(b) The rows of A form an orthonormal set. 

(c) The columns of A form an orthonormal set. 
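The equivalences in Theorem 2.6 can be checked on a concrete matrix. The Python sketch below is an illustration, not part of the original text; the helper names are assumptions, exact fractions are used to avoid rounding, and the matrix is taken to be the one in Example 2.13 with entries 1/9, 8/9, -4/9 in the first row, 4/9, -4/9, -7/9 in the second, and 8/9, 1/9, 4/9 in the third.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def is_orthogonal(A):
    """True when A A^T = A^T A = I."""
    At = transpose(A)
    I = identity(len(A))
    return matmul(A, At) == I and matmul(At, A) == I

def rows_orthonormal(A):
    """True when u_i . u_j equals the Kronecker delta for all rows u_i, u_j of A."""
    n = len(A)
    return all(sum(x * y for x, y in zip(A[i], A[j])) == (1 if i == j else 0)
               for i in range(n) for j in range(n))

A = [[Fraction(x, 9) for x in row] for row in ([1, 8, -4], [4, -4, -7], [8, 1, 4])]

checks = (is_orthogonal(A), rows_orthonormal(A), rows_orthonormal(transpose(A)))
```

Applying `rows_orthonormal` to the transpose tests the columns of A, which covers condition (c) of the theorem.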


For n = 2, we have the following result (proved in Problem 2.28). 
THEOREM 2.7: Let A be a real 2 x 2 orthogonal matrix. Then, for some real number θ,

\[ A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \]

(c) Normal Matrices 

A real matrix A is normal if it commutes with its transpose $A^T$; that is, if $AA^T = A^T A$. If A is symmetric,
orthogonal, or skew-symmetric, then A is normal. There are also other normal matrices. 


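EXAMPLE 2.14 Let $A = \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix}$. Then

\[ AA^T = \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix} \begin{bmatrix} 6 & 3 \\ -3 & 6 \end{bmatrix} = \begin{bmatrix} 45 & 0 \\ 0 & 45 \end{bmatrix} \quad \text{and} \quad A^T A = \begin{bmatrix} 6 & 3 \\ -3 & 6 \end{bmatrix} \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix} = \begin{bmatrix} 45 & 0 \\ 0 & 45 \end{bmatrix} \]

Because $AA^T = A^T A$, the matrix A is normal.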


2.11 Complex Matrices 


Let A be a complex matrix, that is, a matrix with complex entries. Recall (Section 1.7) that if z = a + bi is
a complex number, then $\bar{z} = a - bi$ is its conjugate. The conjugate of a complex matrix A, written $\bar{A}$, is the
matrix obtained from A by taking the conjugate of each entry in A. That is, if $A = [a_{ij}]$, then $\bar{A} = [b_{ij}]$,
where $b_{ij} = \bar{a}_{ij}$. (We denote this fact by writing $\bar{A} = [\bar{a}_{ij}]$.)

The two operations of transpose and conjugation commute for any complex matrix A, and the special
notation $A^H$ is used for the conjugate transpose of A. That is,

\[ A^H = (\bar{A})^T = \overline{(A^T)} \]

Note that if A is real, then $A^H = A^T$. [Some texts use $A^*$ instead of $A^H$.]


EXAMPLE 2.15 Let $A = \begin{bmatrix} 2 + 8i & 5 - 3i & 4 - 7i \\ 6i & 1 - 4i & 3 + 2i \end{bmatrix}$. Then $A^H = \begin{bmatrix} 2 - 8i & -6i \\ 5 + 3i & 1 + 4i \\ 4 + 7i & 3 - 2i \end{bmatrix}$.

Special Complex Matrices: Hermitian, Unitary, Normal [Optional until Chapter 12] 

Consider a complex matrix A. The relationship between A and its conjugate transpose $A^H$ yields important
kinds of complex matrices (which are analogous to the kinds of real matrices described above).


A complex matrix A is said to be Hermitian or skew-Hermitian according as to whether

\[ A^H = A \quad \text{or} \quad A^H = -A. \]

Clearly, $A = [a_{ij}]$ is Hermitian if and only if symmetric elements are conjugate, that is, if each $a_{ij} = \bar{a}_{ji}$,
in which case each diagonal element $a_{ii}$ must be real. Similarly, if A is skew-symmetric, then each diagonal
element $a_{ii} = 0$. (Note that A must be square if $A^H = A$ or $A^H = -A$.)

A complex matrix A is unitary if $A^H A = AA^H = I$; that is, if

\[ A^H = A^{-1}. \]

Thus, A must necessarily be square and invertible. We note that a complex matrix A is unitary if and only if
its rows (columns) form an orthonormal set relative to the dot product of complex vectors.

A complex matrix A is said to be normal if it commutes with $A^H$; that is, if

\[ AA^H = A^H A \]

(Thus, A must be a square matrix.) This definition reduces to that for real matrices when A is real.


EXAMPLE 2.16 Consider the following complex matrices:

\[
A = \begin{bmatrix} 3 & 1 - 2i & 4 + 7i \\ 1 + 2i & -4 & -2i \\ 4 - 7i & 2i & 5 \end{bmatrix}, \quad
B = \frac{1}{2} \begin{bmatrix} 1 & -i & -1 + i \\ i & 1 & 1 + i \\ 1 + i & -1 + i & 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 2 + 3i & 1 \\ i & 1 + 2i \end{bmatrix}
\]

(a) By inspection, the diagonal elements of A are real, and the symmetric elements 1 - 2i and 1 + 2i are
conjugate, 4 + 7i and 4 - 7i are conjugate, and -2i and 2i are conjugate. Thus, A is Hermitian.

(b) Multiplying B by $B^H$ yields I; that is, $BB^H = I$. This implies $B^H B = I$, as well. Thus, $B^H = B^{-1}$, which
means B is unitary.

(c) To show C is normal, we evaluate $CC^H$ and $C^H C$:

\[ CC^H = \begin{bmatrix} 2 + 3i & 1 \\ i & 1 + 2i \end{bmatrix} \begin{bmatrix} 2 - 3i & -i \\ 1 & 1 - 2i \end{bmatrix} = \begin{bmatrix} 14 & 4 - 4i \\ 4 + 4i & 6 \end{bmatrix} \]

and similarly $C^H C = \begin{bmatrix} 14 & 4 - 4i \\ 4 + 4i & 6 \end{bmatrix}$. Because $CC^H = C^H C$, the complex matrix C is normal.


We note that when a matrix A is real, Hermitian is the same as symmetric, and unitary is the same as 
orthogonal. 
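The three properties in Example 2.16 can be tested with Python's built-in complex numbers. The sketch below is illustrative and not part of the original text; the helper names are assumptions, and the entries are taken to be those of the matrices A, B, C in Example 2.16. All entries here are multiples of 1/2, so the floating-point arithmetic is exact and the comparisons with `==` are safe for this example; for general entries a tolerance would be needed.

```python
def conj_transpose(A):
    """Return A^H, the conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def is_hermitian(A):
    return conj_transpose(A) == A

def is_unitary(A):
    AH = conj_transpose(A)
    return matmul(A, AH) == identity(len(A)) and matmul(AH, A) == identity(len(A))

def is_normal(A):
    AH = conj_transpose(A)
    return matmul(A, AH) == matmul(AH, A)

A = [[3, 1 - 2j, 4 + 7j], [1 + 2j, -4, -2j], [4 - 7j, 2j, 5]]
B = [[0.5, -0.5j, -0.5 + 0.5j], [0.5j, 0.5, 0.5 + 0.5j], [0.5 + 0.5j, -0.5 + 0.5j, 0]]
C = [[2 + 3j, 1], [1j, 1 + 2j]]

CCH = matmul(C, conj_transpose(C))
```

Hermitian and unitary matrices are themselves normal, so `is_normal` also returns true for A and B.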


2.12 Block Matrices 


Using a system of horizontal and vertical (dashed) lines, we can partition a matrix A into submatrices 
called blocks (or cells) of A. Clearly a given matrix may be divided into blocks in different ways. For 
example, 


\[ \begin{bmatrix} 1 & -2 & 0 & 1 & 3 \\ 2 & 3 & 5 & 7 & -2 \\ 3 & 1 & 4 & 5 & 9 \\ 4 & 6 & -3 & 1 & 8 \end{bmatrix} \]

[Three copies of this matrix, each partitioned into blocks by a different arrangement of dashed horizontal and vertical lines.]

The convenience of the partition of matrices, say A and B, into blocks is that the result of operations on A
and B can be obtained by carrying out the computation with the blocks, just as if they were the actual
elements of the matrices. This is illustrated below, where the notation $A = [A_{ij}]$ will be used for a block
matrix A with blocks $A_{ij}$.

Suppose that $A = [A_{ij}]$ and $B = [B_{ij}]$ are block matrices with the same numbers of row and column
blocks, and suppose that corresponding blocks have the same size. Then adding the corresponding blocks
of A and B also adds the corresponding elements of A and B, and multiplying each block of A by a scalar k
multiplies each element of A by k. Thus,



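\[
A + B = \begin{bmatrix}
A_{11} + B_{11} & A_{12} + B_{12} & \cdots & A_{1n} + B_{1n} \\
A_{21} + B_{21} & A_{22} + B_{22} & \cdots & A_{2n} + B_{2n} \\
\vdots & \vdots & & \vdots \\
A_{m1} + B_{m1} & A_{m2} + B_{m2} & \cdots & A_{mn} + B_{mn}
\end{bmatrix}
\]

and

\[
kA = \begin{bmatrix}
kA_{11} & kA_{12} & \cdots & kA_{1n} \\
kA_{21} & kA_{22} & \cdots & kA_{2n} \\
\vdots & \vdots & & \vdots \\
kA_{m1} & kA_{m2} & \cdots & kA_{mn}
\end{bmatrix}
\]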


The case of matrix multiplication is less obvious, but still true. That is, suppose that $U = [U_{ik}]$ and
$V = [V_{kj}]$ are block matrices such that the number of columns of each block $U_{ik}$ is equal to the number of
rows of each block $V_{kj}$. (Thus, each product $U_{ik} V_{kj}$ is defined.) Then

\[
UV = \begin{bmatrix}
W_{11} & W_{12} & \cdots & W_{1n} \\
W_{21} & W_{22} & \cdots & W_{2n} \\
\vdots & \vdots & & \vdots \\
W_{m1} & W_{m2} & \cdots & W_{mn}
\end{bmatrix},
\quad \text{where} \quad W_{ij} = U_{i1} V_{1j} + U_{i2} V_{2j} + \cdots + U_{ip} V_{pj}
\]



The proof of the above formula for UV is straightforward but detailed and lengthy. It is left as an exercise 
(Problem 2.85). 
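Although the general proof is lengthy, the block formula is easy to test on a small case. The following Python sketch is an illustration under stated assumptions: the helper names, the representation of a partition by lists of cut positions, and the sample matrices U and V are all choices made here and do not come from the text.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def partition(M, row_cuts, col_cuts):
    """Split M into blocks; cuts such as [0, 2, 3] mean rows 0-1 and then row 2."""
    return [[[row[c0:c1] for row in M[r0:r1]]
             for c0, c1 in zip(col_cuts, col_cuts[1:])]
            for r0, r1 in zip(row_cuts, row_cuts[1:])]

def block_matmul(U, V):
    """Compute the blocks W_ij = U_i1 V_1j + U_i2 V_2j + ... + U_ip V_pj."""
    W = []
    for i in range(len(U)):
        block_row = []
        for j in range(len(V[0])):
            acc = matmul(U[i][0], V[0][j])
            for k in range(1, len(V)):
                acc = add(acc, matmul(U[i][k], V[k][j]))
            block_row.append(acc)
        W.append(block_row)
    return W

def assemble(blocks):
    """Glue a block matrix back into an ordinary list-of-rows matrix."""
    rows = []
    for block_row in blocks:
        for r in range(len(block_row[0])):
            rows.append([x for blk in block_row for x in blk[r]])
    return rows

# Illustrative matrices (assumptions). The column cuts of U match the row cuts of V.
U = [[1, 2, 1], [3, 4, 0], [0, 0, 2]]
V = [[1, 2, 3, 1], [4, 5, 6, 1], [0, 0, 0, 1]]
Ub = partition(U, [0, 2, 3], [0, 2, 3])
Vb = partition(V, [0, 2, 3], [0, 3, 4])

UV_blockwise = assemble(block_matmul(Ub, Vb))
UV_direct = matmul(U, V)
```

The blockwise and direct products agree because the column partition of U coincides with the row partition of V, which is the condition stated in the text for each product of blocks to be defined.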


Square Block Matrices 

Let M be a block matrix. Then M is called a square block matrix if 

(i) M is a square matrix. 

(ii) The blocks form a square matrix. 

(iii) The diagonal blocks are also square matrices. 

The latter two conditions will occur if and only if there are the same number of horizontal and vertical 
lines and they are placed symmetrically. 

Consider the following two block matrices: 


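\[
A = \left[\begin{array}{cc|cc|c}
1 & 2 & 3 & 4 & 5 \\
1 & 1 & 1 & 1 & 1 \\ \hline
9 & 8 & 7 & 6 & 5 \\ \hline
4 & 4 & 4 & 4 & 4 \\
3 & 5 & 3 & 5 & 3
\end{array}\right]
\quad \text{and} \quad
B = \left[\begin{array}{cc|cc|c}
1 & 2 & 3 & 4 & 5 \\
1 & 1 & 1 & 1 & 1 \\ \hline
9 & 8 & 7 & 6 & 5 \\
4 & 4 & 4 & 4 & 4 \\ \hline
3 & 5 & 3 & 5 & 3
\end{array}\right]
\]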


The block matrix A is not a square block matrix, because the second and third diagonal blocks are not 
square. On the other hand, the block matrix B is a square block matrix. 


Block Diagonal Matrices 

Let $M = [A_{ij}]$ be a square block matrix such that the nondiagonal blocks are all zero matrices; that is,
$A_{ij} = 0$ when i ≠ j. Then M is called a block diagonal matrix. We sometimes denote such a block diagonal
matrix by writing

\[ M = \mathrm{diag}(A_{11}, A_{22}, \ldots, A_{rr}) \quad \text{or} \quad M = A_{11} \oplus A_{22} \oplus \cdots \oplus A_{rr} \]

The importance of block diagonal matrices is that the algebra of the block matrix is frequently reduced to
the algebra of the individual blocks. Specifically, suppose f(x) is a polynomial and M is the above block
diagonal matrix. Then f(M) is a block diagonal matrix, and

\[ f(M) = \mathrm{diag}(f(A_{11}), f(A_{22}), \ldots, f(A_{rr})) \]

Also, M is invertible if and only if each $A_{ii}$ is invertible, and, in such a case, $M^{-1}$ is a block diagonal
matrix, and

\[ M^{-1} = \mathrm{diag}(A_{11}^{-1}, A_{22}^{-1}, \ldots, A_{rr}^{-1}) \]

Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the 
diagonal are zero matrices and a block lower triangular matrix if the blocks above the diagonal are zero 
matrices. 
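EXAMPLE 2.17 Determine which of the following square block matrices are upper triangular, lower
triangular, or diagonal:

\[
A = \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 3 & 4 & 5 \\ \hline 0 & 0 & 6 \end{array}\right], \quad
B = \left[\begin{array}{c|cc|c} 1 & 0 & 0 & 0 \\ \hline 2 & 3 & 4 & 0 \\ 5 & 0 & 6 & 0 \\ \hline 0 & 7 & 8 & 9 \end{array}\right], \quad
C = \left[\begin{array}{c|cc} 1 & 0 & 0 \\ \hline 0 & 2 & 3 \\ 0 & 4 & 5 \end{array}\right], \quad
D = \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 3 & 4 & 5 \\ \hline 0 & 6 & 7 \end{array}\right]
\]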

(a) A is upper triangular because the block below the diagonal is a zero block. 

(b) B is lower triangular because all blocks above the diagonal are zero blocks. 

(c) C is diagonal because the blocks above and below the diagonal are zero blocks. 

(d) D is neither upper triangular nor lower triangular. Also, no other partitioning of D will make it into 
either a block upper triangular matrix or a block lower triangular matrix. 


SOLVED PROBLEMS 


Matrix Addition and Scalar Multiplication 


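2.1. Given $A = \begin{bmatrix} 1 & -2 & 3 \\ 4 & 5 & -6 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 0 & 2 \\ -7 & 1 & 8 \end{bmatrix}$, find: (a) A + B, (b) 2A - 3B.

(a) Add the corresponding elements:

\[ A + B = \begin{bmatrix} 1 + 3 & -2 + 0 & 3 + 2 \\ 4 - 7 & 5 + 1 & -6 + 8 \end{bmatrix} = \begin{bmatrix} 4 & -2 & 5 \\ -3 & 6 & 2 \end{bmatrix} \]

(b) First perform the scalar multiplication and then a matrix addition:

\[ 2A - 3B = \begin{bmatrix} 2 & -4 & 6 \\ 8 & 10 & -12 \end{bmatrix} + \begin{bmatrix} -9 & 0 & -6 \\ 21 & -3 & -24 \end{bmatrix} = \begin{bmatrix} -7 & -4 & 0 \\ 29 & 7 & -36 \end{bmatrix} \]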

(Note that we multiply B by —3 and then add, rather than multiplying B by 3 and subtracting. This usually 
prevents errors.) 


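2.2. Find x, y, z, t where $3 \begin{bmatrix} x & y \\ z & t \end{bmatrix} = \begin{bmatrix} x & 6 \\ -1 & 2t \end{bmatrix} + \begin{bmatrix} 4 & x + y \\ z + t & 3 \end{bmatrix}$.

Write each side as a single matrix:

\[ \begin{bmatrix} 3x & 3y \\ 3z & 3t \end{bmatrix} = \begin{bmatrix} x + 4 & x + y + 6 \\ z + t - 1 & 2t + 3 \end{bmatrix} \]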


Set corresponding entries equal to each other to obtain the following system of four equations: 

3x = x+4, 3y = x + y + 6, 3z = z + t — 1, 3t = 2t + 3 
or 2x = 4, 2y = 6 + x, 2z = t — 1, t = 3 

The solution is x = 2, y = 4, z = 1, t = 3. 


2.3. Prove Theorem 2.1 (i) and (v): (i) (A + B) + C = A + (B + C), (v) k(A + B) = kA + kB.

Suppose $A = [a_{ij}]$, $B = [b_{ij}]$, $C = [c_{ij}]$. The proof reduces to showing that corresponding ij-entries in
each side of each matrix equation are equal. [We prove only (i) and (v), because the other parts of
Theorem 2.1 are proved similarly.]

(i) The ij-entry of A + B is $a_{ij} + b_{ij}$; hence, the ij-entry of (A + B) + C is $(a_{ij} + b_{ij}) + c_{ij}$. On the other hand,
the ij-entry of B + C is $b_{ij} + c_{ij}$; hence, the ij-entry of A + (B + C) is $a_{ij} + (b_{ij} + c_{ij})$. However, for scalars
in K,

\[ (a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij}) \]

Thus, (A + B) + C and A + (B + C) have identical ij-entries. Therefore, (A + B) + C = A + (B + C).

(v) The ij-entry of A + B is $a_{ij} + b_{ij}$; hence, $k(a_{ij} + b_{ij})$ is the ij-entry of k(A + B). On the other hand, the
ij-entries of kA and kB are $ka_{ij}$ and $kb_{ij}$, respectively. Thus, $ka_{ij} + kb_{ij}$ is the ij-entry of kA + kB. However, for
scalars in K,

\[ k(a_{ij} + b_{ij}) = ka_{ij} + kb_{ij} \]

Thus, k(A + B) and kA + kB have identical ij-entries. Therefore, k(A + B) = kA + kB.


Matrix Multiplication 



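2.4. Calculate: (a) $[8, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix}$, (b) $[6, -1, 7, 5] \begin{bmatrix} 4 \\ -9 \\ -3 \\ 2 \end{bmatrix}$, (c) $[3, 8, -2, 4] \begin{bmatrix} 5 \\ -1 \\ 6 \end{bmatrix}$

(a) Multiply the corresponding entries and add:

\[ [8, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} = 8(3) + (-4)(2) + 5(-1) = 24 - 8 - 5 = 11 \]

(b) Multiply the corresponding entries and add:

\[ [6, -1, 7, 5] \begin{bmatrix} 4 \\ -9 \\ -3 \\ 2 \end{bmatrix} = 24 + 9 - 21 + 10 = 22 \]

(c) The product is not defined when the row matrix and the column matrix have different numbers of elements.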


2.5. Let (r x s) denote an r x s matrix. Find the sizes of those matrix products that are defined: 

(a) (2 x 3)(3 x 4), (c) (1 x 2)(3 x 1), (e) (4 x 4)(3 x 3) 

(b) (4 x 1)(1 x 2), (d) (5 x 2)(2 x 3), (f) (2 x 2)(2 x 4) 

In each case, the product is defined if the inner numbers are equal, and then the product will have the size of 
the outer numbers in the given order. 


(a) 2 x 4, (c) not defined, (e) not defined

(b) 4 x 2, (d) 5 x 3, (f) 2 x 4
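The inner-numbers rule used in Problem 2.5 can be written as a one-line check. The Python sketch below is illustrative; the function name `product_size` and the convention of returning `None` for an undefined product are assumptions made here.

```python
def product_size(size_a, size_b):
    """Size of the product of an (m x p) and a (q x n) matrix, or None if p != q."""
    (m, p), (q, n) = size_a, size_b
    return (m, n) if p == q else None

# The six cases (a)-(f) of Problem 2.5.
cases = {
    "a": ((2, 3), (3, 4)),
    "b": ((4, 1), (1, 2)),
    "c": ((1, 2), (3, 1)),
    "d": ((5, 2), (2, 3)),
    "e": ((4, 4), (3, 3)),
    "f": ((2, 2), (2, 4)),
}
sizes = {label: product_size(a, b) for label, (a, b) in cases.items()}
```

The dictionary `sizes` holds the outer numbers for each defined product and `None` for cases (c) and (e).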


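2.6. Let $A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix}$. Find: (a) AB, (b) BA.

(a) Because A is a 2 x 2 matrix and B a 2 x 3 matrix, the product AB is defined and is a 2 x 3 matrix. To obtain
the entries in the first row of AB, multiply the first row [1, 3] of A by the columns
$\begin{bmatrix} 2 \\ 3 \end{bmatrix}$, $\begin{bmatrix} 0 \\ -2 \end{bmatrix}$, $\begin{bmatrix} -4 \\ 6 \end{bmatrix}$ of B, respectively, as follows:

\[ \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix} = \begin{bmatrix} 2 + 9 & 0 - 6 & -4 + 18 \\ \ & \ & \ \end{bmatrix} = \begin{bmatrix} 11 & -6 & 14 \\ \ & \ & \ \end{bmatrix} \]

To obtain the entries in the second row of AB, multiply the second row [2, -1] of A by the columns of B:

\[ \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix} = \begin{bmatrix} 11 & -6 & 14 \\ 4 - 3 & 0 + 2 & -8 - 6 \end{bmatrix} \]

Thus,

\[ AB = \begin{bmatrix} 11 & -6 & 14 \\ 1 & 2 & -14 \end{bmatrix} \]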

(b) The size of B is 2 x 3 and that of A is 2 x 2. The inner numbers 3 and 2 are not equal; hence, the product BA 
is not defined. 


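2.7. Find AB, where $A = \begin{bmatrix} 2 & 3 & -1 \\ 4 & -2 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & -1 & 0 & 6 \\ 1 & 3 & -5 & 1 \\ 4 & 1 & -2 & 2 \end{bmatrix}$.

Because A is a 2 x 3 matrix and B a 3 x 4 matrix, the product AB is defined and is a 2 x 4 matrix. Multiply
the rows of A by the columns of B to obtain

\[ AB = \begin{bmatrix} 4 + 3 - 4 & -2 + 9 - 1 & 0 - 15 + 2 & 12 + 3 - 2 \\ 8 - 2 + 20 & -4 - 6 + 5 & 0 + 10 - 10 & 24 - 2 + 10 \end{bmatrix} = \begin{bmatrix} 3 & 6 & -13 & 13 \\ 26 & -5 & 0 & 32 \end{bmatrix} \]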


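2.8. Find: (a) $\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ -7 \end{bmatrix}$, (b) $\begin{bmatrix} 2 \\ -7 \end{bmatrix} \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}$, (c) $[2, -7] \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}$

(a) The first factor is 2 x 2 and the second is 2 x 1, so the product is defined as a 2 x 1 matrix:

\[ \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ -7 \end{bmatrix} = \begin{bmatrix} 2 - 42 \\ -6 - 35 \end{bmatrix} = \begin{bmatrix} -40 \\ -41 \end{bmatrix} \]

(b) The product is not defined, because the first factor is 2 x 1 and the second factor is 2 x 2.

(c) The first factor is 1 x 2 and the second factor is 2 x 2, so the product is defined as a 1 x 2 (row) matrix:

\[ [2, -7] \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix} = [2 + 21, 12 - 35] = [23, -23] \]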


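2.9. Clearly, 0A = 0 and A0 = 0, where the 0's are zero matrices (with possibly different sizes). Find
matrices A and B with no zero entries such that AB = 0.

Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 6 & 2 \\ -3 & -1 \end{bmatrix}$. Then $AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$.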


2.10. Prove Theorem 2.2(i): (AB)C = A(BC).

Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{kl}]$, and let $AB = S = [s_{ik}]$, $BC = T = [t_{jl}]$. Then

\[ s_{ik} = \sum_{j=1}^{m} a_{ij} b_{jk} \quad \text{and} \quad t_{jl} = \sum_{k=1}^{n} b_{jk} c_{kl} \]

Multiplying S = AB by C, the il-entry of (AB)C is

\[ s_{i1} c_{1l} + s_{i2} c_{2l} + \cdots + s_{in} c_{nl} = \sum_{k=1}^{n} s_{ik} c_{kl} = \sum_{k=1}^{n} \sum_{j=1}^{m} (a_{ij} b_{jk}) c_{kl} \]

On the other hand, multiplying A by T = BC, the il-entry of A(BC) is

\[ a_{i1} t_{1l} + a_{i2} t_{2l} + \cdots + a_{im} t_{ml} = \sum_{j=1}^{m} a_{ij} t_{jl} = \sum_{j=1}^{m} \sum_{k=1}^{n} a_{ij} (b_{jk} c_{kl}) \]

The above sums are equal; that is, corresponding elements in (AB)C and A(BC) are equal. Thus,
(AB)C = A(BC).
2.11. Prove Theorem 2.2(ii): A(B + C) = AB + AC.

Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{jk}]$, and let $D = B + C = [d_{jk}]$, $E = AB = [e_{ik}]$, $F = AC = [f_{ik}]$. Then

\[ d_{jk} = b_{jk} + c_{jk}, \quad e_{ik} = \sum_{j=1}^{m} a_{ij} b_{jk}, \quad f_{ik} = \sum_{j=1}^{m} a_{ij} c_{jk} \]

Thus, the ik-entry of the matrix AB + AC is

\[ e_{ik} + f_{ik} = \sum_{j=1}^{m} a_{ij} b_{jk} + \sum_{j=1}^{m} a_{ij} c_{jk} = \sum_{j=1}^{m} a_{ij} (b_{jk} + c_{jk}) \]

On the other hand, the ik-entry of the matrix AD = A(B + C) is

\[ a_{i1} d_{1k} + a_{i2} d_{2k} + \cdots + a_{im} d_{mk} = \sum_{j=1}^{m} a_{ij} d_{jk} = \sum_{j=1}^{m} a_{ij} (b_{jk} + c_{jk}) \]

Thus, A(B + C) = AB + AC, because the corresponding elements are equal.


Transpose 

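2.12. Find the transpose of each matrix:

\[ A = \begin{bmatrix} 1 & -2 & 3 \\ 7 & 8 & -9 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}, \quad C = [1, -3, 5, -7], \quad D = \begin{bmatrix} 2 \\ -4 \\ 6 \end{bmatrix} \]

Rewrite the rows of each matrix as columns to obtain the transpose of the matrix:

\[ A^T = \begin{bmatrix} 1 & 7 \\ -2 & 8 \\ 3 & -9 \end{bmatrix}, \quad B^T = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}, \quad C^T = \begin{bmatrix} 1 \\ -3 \\ 5 \\ -7 \end{bmatrix}, \quad D^T = [2, -4, 6] \]

(Note that $B^T = B$; such a matrix is said to be symmetric. Note also that the transpose of the row vector C is a
column vector, and the transpose of the column vector D is a row vector.)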


2.13. Prove Theorem 2.3(iv): $(AB)^T = B^T A^T$.

Let $A = [a_{ik}]$ and $B = [b_{kj}]$. Then the ij-entry of AB is

\[
a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{im} b_{mj}
\]

This is the ji-entry (reverse order) of $(AB)^T$. Now column j of B becomes row j of $B^T$, and row i of A becomes
column i of $A^T$. Thus, the ji-entry of $B^T A^T$ is

\[
[b_{1j}, b_{2j}, \ldots, b_{mj}]\,[a_{i1}, a_{i2}, \ldots, a_{im}]^T
= b_{1j} a_{i1} + b_{2j} a_{i2} + \cdots + b_{mj} a_{im}
\]

Thus, $(AB)^T = B^T A^T$, because the corresponding entries are equal.
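A numerical spot check of Theorem 2.3(iv) can be sketched as follows. This is an illustration, not a proof: A is the 2 x 3 matrix of Problem 2.12, while the 3 x 2 matrix B and the helper names are hypothetical choices made only for this example.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(M):
    """Rewrite the rows of M as columns."""
    return [list(column) for column in zip(*M)]

A = [[1, -2, 3], [7, 8, -9]]      # 2 x 3 matrix A of Problem 2.12
B = [[1, 0], [2, -1], [0, 4]]     # hypothetical 3 x 2 matrix

left = transpose(matmul(A, B))                 # (AB)^T
right = matmul(transpose(B), transpose(A))     # B^T A^T
print(left == right)  # True
```

Note that the product in the other order, `matmul(transpose(A), transpose(B))`, is a 3 x 3 matrix here, so it cannot equal $(AB)^T$, which is 2 x 2.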

Square Matrices 

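2.14. Find the diagonal and trace of each matrix:

\[
\text{(a) } A = \begin{bmatrix} 1 & 3 & 6 \\ 2 & -5 & 8 \\ 4 & -2 & 9 \end{bmatrix}, \qquad
\text{(b) } B = \begin{bmatrix} 2 & 4 & 8 \\ 3 & -7 & 9 \\ -5 & 0 & 2 \end{bmatrix}, \qquad
\text{(c) } C, \text{ a matrix that is not square}
\]

(a) The diagonal of A consists of the elements from the upper left corner of A to the lower right corner of A or,
in other words, the elements $a_{11}, a_{22}, a_{33}$. Thus, the diagonal of A consists of the numbers 1, -5, and 9. The
trace of A is the sum of the diagonal elements. Thus,

tr(A) = 1 - 5 + 9 = 5

(b) The diagonal of B consists of the numbers 2, -7, and 2. Hence,

tr(B) = 2 - 7 + 2 = -3

(c) The diagonal and trace are only defined for square matrices.

2.15. Let $A = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}$, and let $f(x) = 2x^3 - 4x + 5$ and $g(x) = x^2 + 2x - 11$. Find
(a) $A^2$, (b) $A^3$, (c) $f(A)$, (d) $g(A)$.

(a) \[
A^2 = AA = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}
= \begin{bmatrix} 1 + 8 & 2 - 6 \\ 4 - 12 & 8 + 9 \end{bmatrix}
= \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix}
\]

(b) \[
A^3 = AA^2 = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix}
= \begin{bmatrix} 9 - 16 & -4 + 34 \\ 36 + 24 & -16 - 51 \end{bmatrix}
= \begin{bmatrix} -7 & 30 \\ 60 & -67 \end{bmatrix}
\]

(c) First substitute A for x and 5I for the constant in f(x), obtaining

\[
f(A) = 2A^3 - 4A + 5I
= 2 \begin{bmatrix} -7 & 30 \\ 60 & -67 \end{bmatrix}
- 4 \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}
+ 5 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

Now perform the scalar multiplication and then the matrix addition:

\[
f(A) = \begin{bmatrix} -14 & 60 \\ 120 & -134 \end{bmatrix}
+ \begin{bmatrix} -4 & -8 \\ -16 & 12 \end{bmatrix}
+ \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix}
= \begin{bmatrix} -13 & 52 \\ 104 & -117 \end{bmatrix}
\]

(d) Substitute A for x and 11I for the constant in g(x), and then calculate as follows:

\[
g(A) = A^2 + 2A - 11I
= \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix}
+ 2 \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}
- 11 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix}
+ \begin{bmatrix} 2 & 4 \\ 8 & -6 \end{bmatrix}
+ \begin{bmatrix} -11 & 0 \\ 0 & -11 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

Because g(A) is the zero matrix, A is a root of the polynomial g(x).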
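Evaluating a polynomial at a square matrix, as in Problem 2.15, can be sketched with Horner's rule. This is one possible approach rather than the method used in the solution above; the function name `poly_at_matrix` and the convention of listing coefficients from the highest power down are assumptions of this example.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate c0*A^d + c1*A^(d-1) + ... + cd*I by Horner's rule.

    coeffs lists the coefficients from the highest power down to the
    constant term; the constant term multiplies the identity matrix.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c      # add c * I
    return result

A = [[1, 2], [4, -3]]
f_A = poly_at_matrix([2, 0, -4, 5], A)   # f(x) = 2x^3 - 4x + 5
g_A = poly_at_matrix([1, 2, -11], A)     # g(x) = x^2 + 2x - 11

print(f_A)  # [[-13, 52], [104, -117]]
print(g_A)  # [[0, 0], [0, 0]]
```

Horner's rule needs only one matrix product per coefficient, so it avoids computing each power of A separately.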


2.16. Let $A = \begin{bmatrix} 1 & 3 \\ 4 & -3 \end{bmatrix}$. (a) Find a nonzero column vector $u = \begin{bmatrix} x \\ y \end{bmatrix}$ such that Au = 3u.
(b) Describe all such vectors.

(a) First set up the matrix equation Au = 3u, and then write each side as a single matrix (column vector) as
follows:

\[
\begin{bmatrix} 1 & 3 \\ 4 & -3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
= 3 \begin{bmatrix} x \\ y \end{bmatrix},
\quad\text{and then}\quad
\begin{bmatrix} x + 3y \\ 4x - 3y \end{bmatrix} = \begin{bmatrix} 3x \\ 3y \end{bmatrix}
\]

Set the corresponding elements equal to each other to obtain a system of equations:

\[
\begin{array}{l} x + 3y = 3x \\ 4x - 3y = 3y \end{array}
\quad\text{or}\quad
\begin{array}{l} 2x - 3y = 0 \\ 4x - 6y = 0 \end{array}
\quad\text{or}\quad
2x - 3y = 0
\]

The system reduces to one nondegenerate linear equation in two unknowns, and so has an infinite number
of solutions. To obtain a nonzero solution, let, say, y = 2; then x = 3. Thus, $u = (3, 2)^T$ is a desired nonzero
vector.

(b) To find the general solution, set y = a, where a is a parameter. Substitute y = a into 2x - 3y = 0 to obtain
$x = \tfrac{3}{2}a$. Thus, $u = (\tfrac{3}{2}a, a)^T$ represents all such solutions.
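The general solution of Problem 2.16 can be checked exactly with rational arithmetic. The sketch below assumes the general solution $u = (\tfrac{3}{2}a, a)^T$ stated above; the helper names `matvec` and `solution` are introduced only for this example.

```python
from fractions import Fraction

def matvec(A, u):
    """Product of a matrix (list of rows) with a column vector (list)."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A = [[1, 3], [4, -3]]

def solution(a):
    """The vector u = (3a/2, a) for a given value of the parameter a."""
    a = Fraction(a)
    return [Fraction(3, 2) * a, a]

# Au = 3u should hold for every value of the parameter a.
checks = [matvec(A, solution(a)) == [3 * x for x in solution(a)]
          for a in (2, -1, Fraction(1, 3), 0)]
print(checks)  # [True, True, True, True]
```

The value a = 0 gives the zero vector, which satisfies Au = 3u trivially; part (a) of the problem asks for a nonzero choice such as a = 2.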


Invertible Matrices, Inverses 



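2.17. Show that
\[
A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{bmatrix}
\]
are inverses.

Compute the product AB, obtaining

\[
AB = \begin{bmatrix}
-11 + 0 + 12 & 2 + 0 - 2 & 2 + 0 - 2 \\
-22 + 4 + 18 & 4 + 0 - 3 & 4 - 1 - 3 \\
-44 - 4 + 48 & 8 + 0 - 8 & 8 + 1 - 8
\end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

Because AB = I, we can conclude (Theorem 3.18) that BA = I. Accordingly, A and B are inverses.

2.18. Find the inverse, if possible, of each matrix:

\[
\text{(a) } A = \begin{bmatrix} 5 & 3 \\ 4 & 2 \end{bmatrix}, \qquad
\text{(b) } B = \begin{bmatrix} 2 & -3 \\ 1 & 3 \end{bmatrix}, \qquad
\text{(c) } C = \begin{bmatrix} -2 & 6 \\ 3 & -9 \end{bmatrix}
\]

Use the formula for the inverse of a 2 x 2 matrix appearing in Section 2.9.

(a) First find |A| = 5(2) - 3(4) = 10 - 12 = -2. Next interchange the diagonal elements, take the negatives
of the nondiagonal elements, and multiply by 1/|A|:

\[
A^{-1} = -\frac{1}{2} \begin{bmatrix} 2 & -3 \\ -4 & 5 \end{bmatrix}
= \begin{bmatrix} -1 & \tfrac{3}{2} \\ 2 & -\tfrac{5}{2} \end{bmatrix}
\]

(b) First find |B| = 2(3) - (-3)(1) = 6 + 3 = 9. Next interchange the diagonal elements, take the negatives of
the nondiagonal elements, and multiply by 1/|B|:

\[
B^{-1} = \frac{1}{9} \begin{bmatrix} 3 & 3 \\ -1 & 2 \end{bmatrix}
= \begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} \\ -\tfrac{1}{9} & \tfrac{2}{9} \end{bmatrix}
\]

(c) First find |C| = -2(-9) - 6(3) = 18 - 18 = 0. Because |C| = 0, C has no inverse.

2.19. Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 4 \end{bmatrix}$. Find $A^{-1} = \begin{bmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{bmatrix}$.

Multiplying A by $A^{-1}$ and setting the nine entries equal to the nine entries of the identity matrix I yields the
following three systems of three equations in three of the unknowns:

\[
\begin{array}{lll}
x_1 + y_1 + z_1 = 1 & \qquad x_2 + y_2 + z_2 = 0 & \qquad x_3 + y_3 + z_3 = 0 \\
y_1 + 2z_1 = 0 & \qquad y_2 + 2z_2 = 1 & \qquad y_3 + 2z_3 = 0 \\
x_1 + 2y_1 + 4z_1 = 0 & \qquad x_2 + 2y_2 + 4z_2 = 0 & \qquad x_3 + 2y_3 + 4z_3 = 1
\end{array}
\]

[Note that A is the coefficient matrix for all three systems.]
Solving the three systems for the nine unknowns yields

\[
x_1 = 0,\ y_1 = 2,\ z_1 = -1; \qquad x_2 = -2,\ y_2 = 3,\ z_2 = -1; \qquad x_3 = 1,\ y_3 = -2,\ z_3 = 1
\]

Thus,
\[
A^{-1} = \begin{bmatrix} 0 & -2 & 1 \\ 2 & 3 & -2 \\ -1 & -1 & 1 \end{bmatrix}
\]

(Remark: Chapter 3 gives an efficient way to solve the three systems.)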
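The 2 x 2 inverse rule used in Problem 2.18 (interchange the diagonal entries, negate the others, divide by the determinant) and the check $AA^{-1} = I$ for Problem 2.19 can be sketched as follows. The function names are our own, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] by the determinant formula, or None."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None                      # singular: no inverse exists
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

inv_A = inverse_2x2([[5, 3], [4, 2]])     # entries -1, 3/2, 2, -5/2
inv_B = inverse_2x2([[2, -3], [1, 3]])    # entries 1/3, 1/3, -1/9, 2/9
inv_C = inverse_2x2([[-2, 6], [3, -9]])   # None, because |C| = 0

# Problem 2.19: multiply A by the claimed inverse and compare with I.
A3 = [[1, 1, 1], [0, 1, 2], [1, 2, 4]]
A3_inv = [[0, -2, 1], [2, 3, -2], [-1, -1, 1]]
check = matmul(A3, A3_inv)
print(check)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```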


2.20. Let A and B be invertible matrices (with the same size). Show that AB is also invertible and
$(AB)^{-1} = B^{-1}A^{-1}$. [Thus, by induction, $(A_1 A_2 \cdots A_m)^{-1} = A_m^{-1} \cdots A_2^{-1} A_1^{-1}$.]

Using the associativity of matrix multiplication, we get

\[
\begin{aligned}
(AB)(B^{-1}A^{-1}) &= A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I \\
(B^{-1}A^{-1})(AB) &= B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I
\end{aligned}
\]

Thus, $(AB)^{-1} = B^{-1}A^{-1}$.

Diagonal and Triangular Matrices

2.21. Write out the diagonal matrices A = diag(4, -3, 7), B = diag(2, -6), C = diag(3, -8, 0, 5).

Put the given scalars on the diagonal and 0's elsewhere:

\[
A = \begin{bmatrix} 4 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 7 \end{bmatrix}, \qquad
B = \begin{bmatrix} 2 & 0 \\ 0 & -6 \end{bmatrix}, \qquad
C = \begin{bmatrix} 3 & & & \\ & -8 & & \\ & & 0 & \\ & & & 5 \end{bmatrix}
\]

2.22. Let A = diag(2, 3, 5) and B = diag(7, 0, -4). Find

(a) AB, $A^2$, $B^2$; (b) f(A), where $f(x) = x^2 + 3x - 2$; (c) $A^{-1}$ and $B^{-1}$.

(a) The product matrix AB is a diagonal matrix obtained by multiplying corresponding diagonal entries; hence,

AB = diag(2(7), 3(0), 5(-4)) = diag(14, 0, -20)

Thus, the squares $A^2$ and $B^2$ are obtained by squaring each diagonal entry; hence,

$A^2$ = diag($2^2, 3^2, 5^2$) = diag(4, 9, 25) and $B^2$ = diag(49, 0, 16)

(b) f(A) is a diagonal matrix obtained by evaluating f(x) at each diagonal entry. We have

f(2) = 4 + 6 - 2 = 8, f(3) = 9 + 9 - 2 = 16, f(5) = 25 + 15 - 2 = 38

Thus, f(A) = diag(8, 16, 38).

(c) The inverse of a diagonal matrix is a diagonal matrix obtained by taking the inverse (reciprocal)
of each diagonal entry. Thus, $A^{-1}$ = diag($\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{5}$), but B has no inverse because there is a 0 on the
diagonal.

2.23. Find a 2 x 2 matrix A such that $A^2$ is diagonal but not A.

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & -1 \end{bmatrix}$. Then $A^2 = \begin{bmatrix} 7 & 0 \\ 0 & 7 \end{bmatrix}$, which is diagonal.


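2.24. Find an upper triangular matrix A such that $A^3 = \begin{bmatrix} 8 & -57 \\ 0 & 27 \end{bmatrix}$.

Set $A = \begin{bmatrix} x & y \\ 0 & z \end{bmatrix}$. Then $x^3 = 8$, so x = 2; and $z^3 = 27$, so z = 3. Next calculate $A^3$ using x = 2 and z = 3:

\[
A^2 = \begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix}
= \begin{bmatrix} 4 & 5y \\ 0 & 9 \end{bmatrix}
\quad\text{and}\quad
A^3 = \begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 4 & 5y \\ 0 & 9 \end{bmatrix}
= \begin{bmatrix} 8 & 19y \\ 0 & 27 \end{bmatrix}
\]

Thus, 19y = -57, or y = -3. Accordingly, $A = \begin{bmatrix} 2 & -3 \\ 0 & 3 \end{bmatrix}$.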
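The answer to Problem 2.24 can be confirmed by cubing the matrix directly. The sketch below also uses the fact that, for an upper triangular matrix with rows (x, y) and (0, z), the upper right entry of the cube is $y(x^2 + xz + z^2)$; for x = 2 and z = 3 this factor is the 19 found above. The helper names are illustrative only.

```python
def matmul(A, B):
    """Multiply two 2 x 2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cube(A):
    """Return A^3 = A(AA)."""
    return matmul(A, matmul(A, A))

x, z = 2, 3                       # cube roots of the diagonal entries 8 and 27
factor = x * x + x * z + z * z    # coefficient of y in the (1,2) entry of A^3
y = -57 // factor                 # exact here, because 19 divides 57

A = [[x, y], [0, z]]
print(A)        # [[2, -3], [0, 3]]
print(cube(A))  # [[8, -57], [0, 27]]
```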


2.25. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be upper triangular matrices. Prove that AB is upper triangular with
diagonal $a_{11}b_{11}, a_{22}b_{22}, \ldots, a_{nn}b_{nn}$.

Let $AB = [c_{ij}]$. Then $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$ and $c_{ii} = \sum_{k=1}^{n} a_{ik} b_{ki}$. Suppose i > j. Then, for any k, either i > k or
k > j, so that either $a_{ik} = 0$ or $b_{kj} = 0$. Thus, $c_{ij} = 0$, and AB is upper triangular. Suppose i = j. Then, for k < i,
we have $a_{ik} = 0$; and, for k > i, we have $b_{ki} = 0$. Hence, $c_{ii} = a_{ii} b_{ii}$, as claimed. [This proves one part of
Theorem 2.5(i); the statements for A + B and kA are left as exercises.]

Special Real Matrices: Symmetric and Orthogonal

2.26. Determine whether or not each of the following matrices is symmetric (that is, $A^T = A$) or
skew-symmetric (that is, $A^T = -A$):

\[
\text{(a) } A = \begin{bmatrix} 5 & -7 & 1 \\ -7 & 8 & 2 \\ 1 & 2 & -4 \end{bmatrix}, \qquad
\text{(b) } B = \begin{bmatrix} 0 & 4 & -3 \\ -4 & 0 & 5 \\ 3 & -5 & 0 \end{bmatrix}, \qquad
\text{(c) } C = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

(a) By inspection, the symmetric elements (mirror images in the diagonal) are -7 and -7, 1 and 1, 2 and 2.
Thus, A is symmetric, because symmetric elements are equal.

(b) By inspection, the diagonal elements are all 0, and the symmetric elements, 4 and -4, -3 and 3, and 5 and
-5, are negatives of each other. Hence, B is skew-symmetric.

(c) Because C is not square, C is neither symmetric nor skew-symmetric.

2.27. Suppose $B = \begin{bmatrix} 4 & x + 2 \\ 2x - 3 & x + 1 \end{bmatrix}$ is symmetric. Find x and B.

Set the symmetric elements x + 2 and 2x - 3 equal to each other, obtaining 2x - 3 = x + 2 or x = 5. Hence,
$B = \begin{bmatrix} 4 & 7 \\ 7 & 6 \end{bmatrix}$.


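2.28. Let A be an arbitrary 2 x 2 (real) orthogonal matrix.

(a) Prove: If (a, b) is the first row of A, then $a^2 + b^2 = 1$ and

\[
A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix}
\quad\text{or}\quad
A = \begin{bmatrix} a & b \\ b & -a \end{bmatrix}
\]

(b) Prove Theorem 2.7: For some real number $\theta$,

\[
A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\quad\text{or}\quad
A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}
\]

(a) Suppose (x, y) is the second row of A. Because the rows of A form an orthonormal set, we get

\[
a^2 + b^2 = 1, \qquad x^2 + y^2 = 1, \qquad ax + by = 0
\]

Similarly, the columns form an orthonormal set, so

\[
a^2 + x^2 = 1, \qquad b^2 + y^2 = 1, \qquad ab + xy = 0
\]

Therefore, $x^2 = 1 - a^2 = b^2$, whence $x = \pm b$.

Case (i): x = b. Then b(a + y) = 0, so y = -a.
Case (ii): x = -b. Then b(y - a) = 0, so y = a.
(If b = 0, then x = 0 and $y = \pm 1 = \pm a$, so A again has one of the two forms.)
This means, as claimed,

\[
A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix}
\quad\text{or}\quad
A = \begin{bmatrix} a & b \\ b & -a \end{bmatrix}
\]

(b) Because $a^2 + b^2 = 1$, we have $-1 \le a \le 1$. Let $a = \cos\theta$. Then $b^2 = 1 - \cos^2\theta = \sin^2\theta$, so $b = \pm\sin\theta$;
replacing $\theta$ by $-\theta$ if necessary (which leaves $\cos\theta$ unchanged), we may take $b = \sin\theta$. This proves
the theorem.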

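2.29. Find a 2 x 2 orthogonal matrix A whose first row is a (positive) multiple of (3, 4).

Normalize (3, 4) to get $(\tfrac{3}{5}, \tfrac{4}{5})$. Then, by Problem 2.28,

\[
A = \begin{bmatrix} \tfrac{3}{5} & \tfrac{4}{5} \\ -\tfrac{4}{5} & \tfrac{3}{5} \end{bmatrix}
\quad\text{or}\quad
A = \begin{bmatrix} \tfrac{3}{5} & \tfrac{4}{5} \\ \tfrac{4}{5} & -\tfrac{3}{5} \end{bmatrix}
\]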


2.30. Find a 3 x 3 orthogonal matrix P whose first two rows are multiples of $u_1 = (1, 1, 1)$ and
$u_2 = (0, -1, 1)$, respectively. (Note that, as required, $u_1$ and $u_2$ are orthogonal.)

First find a nonzero vector $u_3$ orthogonal to $u_1$ and $u_2$; say (cross product) $u_3 = u_1 \times u_2 = (2, -1, -1)$. Let A be
the matrix whose rows are $u_1, u_2, u_3$; and let P be the matrix obtained from A by normalizing the rows of A. Thus,

\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & 1 \\ 2 & -1 & -1 \end{bmatrix}
\quad\text{and}\quad
P = \begin{bmatrix}
1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\
0 & -1/\sqrt{2} & 1/\sqrt{2} \\
2/\sqrt{6} & -1/\sqrt{6} & -1/\sqrt{6}
\end{bmatrix}
\]
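The construction in Problem 2.30 (take a cross product for the third row, then normalize every row) can be sketched numerically. Floating-point arithmetic is assumed, so orthogonality is tested only up to a small tolerance; the function names and the tolerance value are choices made for this example.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normalize(u):
    """Divide u by its Euclidean length."""
    length = math.sqrt(sum(x * x for x in u))
    return [x / length for x in u]

def is_orthogonal(P, tol=1e-12):
    """Check that the rows of P are orthonormal, i.e. P P^T = I."""
    n = len(P)
    for i in range(n):
        for j in range(n):
            dot = sum(P[i][k] * P[j][k] for k in range(n))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

u1 = (1, 1, 1)
u2 = (0, -1, 1)
u3 = cross(u1, u2)                       # (2, -1, -1)
P = [normalize(u) for u in (u1, u2, u3)]
print(u3, is_orthogonal(P))  # (2, -1, -1) True
```

Replacing $u_3$ by $-u_3$ gives another valid answer, since any nonzero multiple of the cross product is orthogonal to both $u_1$ and $u_2$.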


Complex Matrices: Hermitian and Unitary Matrices 


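2.31. Find $A^H$ where

\[
\text{(a) } A = \begin{bmatrix} 3 - 5i & 2 + 4i \\ 6 + 7i & 1 + 8i \end{bmatrix}, \qquad
\text{(b) } A = \begin{bmatrix} 2 - 3i & 5 + 8i \\ -4 & 3 - 7i \\ -6 - i & 5i \end{bmatrix}
\]

Recall that $A^H = \bar{A}^T$, the conjugate transpose of A. Thus,

\[
\text{(a) } A^H = \begin{bmatrix} 3 + 5i & 6 - 7i \\ 2 - 4i & 1 - 8i \end{bmatrix}, \qquad
\text{(b) } A^H = \begin{bmatrix} 2 + 3i & -4 & -6 + i \\ 5 - 8i & 3 + 7i & -5i \end{bmatrix}
\]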


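2.32. Show that $A = \begin{bmatrix} \tfrac{1}{3} - \tfrac{2}{3}i & \tfrac{2}{3}i \\ -\tfrac{2}{3}i & -\tfrac{1}{3} - \tfrac{2}{3}i \end{bmatrix}$ is unitary.

The rows of A form an orthonormal set:

\[
\begin{aligned}
\left(\tfrac{1}{3} - \tfrac{2}{3}i,\ \tfrac{2}{3}i\right) \cdot \left(\tfrac{1}{3} - \tfrac{2}{3}i,\ \tfrac{2}{3}i\right) &= \left(\tfrac{1}{9} + \tfrac{4}{9}\right) + \tfrac{4}{9} = 1 \\
\left(\tfrac{1}{3} - \tfrac{2}{3}i,\ \tfrac{2}{3}i\right) \cdot \left(-\tfrac{2}{3}i,\ -\tfrac{1}{3} - \tfrac{2}{3}i\right) &= \left(\tfrac{2}{9}i + \tfrac{4}{9}\right) + \left(-\tfrac{2}{9}i - \tfrac{4}{9}\right) = 0 \\
\left(-\tfrac{2}{3}i,\ -\tfrac{1}{3} - \tfrac{2}{3}i\right) \cdot \left(-\tfrac{2}{3}i,\ -\tfrac{1}{3} - \tfrac{2}{3}i\right) &= \tfrac{4}{9} + \left(\tfrac{1}{9} + \tfrac{4}{9}\right) = 1
\end{aligned}
\]

Thus, A is unitary.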




2.33. Prove the complex analogue of Theorem 2.6: Let A be a complex matrix. Then the following are
equivalent: (i) A is unitary. (ii) The rows of A form an orthonormal set. (iii) The columns of A form
an orthonormal set.

(The proof is almost identical to the proof on page 37 for the case when A is a 3 x 3 real matrix.)
First recall that the vectors $u_1, u_2, \ldots, u_n$ in $C^n$ form an orthonormal set if they are unit vectors and are
orthogonal to each other, where the dot product in $C^n$ is defined by

\[
(a_1, a_2, \ldots, a_n) \cdot (b_1, b_2, \ldots, b_n) = a_1 \bar{b}_1 + a_2 \bar{b}_2 + \cdots + a_n \bar{b}_n
\]

Suppose A is unitary, and $R_1, R_2, \ldots, R_n$ are its rows. Then $\bar{R}_1^T, \bar{R}_2^T, \ldots, \bar{R}_n^T$ are the columns of $A^H$. Let
$AA^H = [c_{ij}]$. By matrix multiplication, $c_{ij} = R_i \bar{R}_j^T = R_i \cdot R_j$. Because A is unitary, we have $AA^H = I$.
Multiplying A by $A^H$ and setting each entry $c_{ij}$ equal to the corresponding entry in I yields the following $n^2$
equations:

\[
R_1 \cdot R_1 = 1, \quad R_2 \cdot R_2 = 1, \quad \ldots, \quad R_n \cdot R_n = 1, \quad\text{and}\quad R_i \cdot R_j = 0 \ \text{ for } i \ne j
\]

Thus, the rows of A are unit vectors and are orthogonal to each other; hence, they form an orthonormal set of
vectors. The condition $A^H A = I$ similarly shows that the columns of A also form an orthonormal set of vectors.
Furthermore, because each step is reversible, the converse is true. This proves the theorem.

Block Matrices 

2.34. Consider the following block matrices (which are partitions of the same matrix):

\[
\text{(a) } \left[\begin{array}{cc|cc|c}
1 & -2 & 0 & 1 & 3 \\
2 & 3 & 5 & 7 & -2 \\ \hline
3 & 1 & 4 & 5 & 9
\end{array}\right], \qquad
\text{(b) } \left[\begin{array}{ccc|cc}
1 & -2 & 0 & 1 & 3 \\ \hline
2 & 3 & 5 & 7 & -2 \\ \hline
3 & 1 & 4 & 5 & 9
\end{array}\right]
\]

Find the size of each block matrix and also the size of each block.

(a) The block matrix has two rows of matrices and three columns of matrices; hence, its size is 2 x 3. The block
sizes are 2 x 2, 2 x 2, and 2 x 1 for the first row; and 1 x 2, 1 x 2, and 1 x 1 for the second row.

(b) The size of the block matrix is 3 x 2; and the block sizes are 1 x 3 and 1 x 2 for each of the three rows.


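2.35. Compute AB using block multiplication, where

\[
A = \left[\begin{array}{cc|c} 1 & 2 & 1 \\ 3 & 4 & 0 \\ \hline 0 & 0 & 2 \end{array}\right]
\quad\text{and}\quad
B = \left[\begin{array}{ccc|c} 1 & 2 & 3 & 1 \\ 4 & 5 & 6 & 1 \\ \hline 0 & 0 & 0 & 1 \end{array}\right]
\]

Here $A = \begin{bmatrix} E & F \\ 0_{1 \times 2} & G \end{bmatrix}$ and $B = \begin{bmatrix} R & S \\ 0_{1 \times 3} & T \end{bmatrix}$, where E, F, G, R, S, T are the given blocks, and $0_{1 \times 2}$ and $0_{1 \times 3}$
are zero matrices of the indicated sizes. Hence,

\[
AB = \begin{bmatrix} ER & ES + FT \\ 0_{1 \times 3} & GT \end{bmatrix}
= \left[\begin{array}{c|c}
\begin{bmatrix} 9 & 12 & 15 \\ 19 & 26 & 33 \end{bmatrix} &
\begin{bmatrix} 3 \\ 7 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\ \hline
\begin{bmatrix} 0 & 0 & 0 \end{bmatrix} & 2
\end{array}\right]
= \begin{bmatrix} 9 & 12 & 15 & 4 \\ 19 & 26 & 33 & 7 \\ 0 & 0 & 0 & 2 \end{bmatrix}
\]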
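The block computation of Problem 2.35 can be compared with the ordinary product. The sketch below assembles AB from the blocks ER, ES + FT, and GT and checks the result against direct multiplication; the variable and helper names are our own.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Blocks of A = [E F; 0 G] and B = [R S; 0 T] from Problem 2.35.
E, F, G = [[1, 2], [3, 4]], [[1], [0]], [[2]]
R, S, T = [[1, 2, 3], [4, 5, 6]], [[1], [1]], [[1]]

top_left = matmul(E, R)                           # ER
top_right = mat_add(matmul(E, S), matmul(F, T))   # ES + FT
bottom_left = [[0, 0, 0]]                         # zero block, 1 x 3
bottom_right = matmul(G, T)                       # GT

# Glue the blocks together row by row.
AB_blocks = ([l + r for l, r in zip(top_left, top_right)]
             + [l + r for l, r in zip(bottom_left, bottom_right)])

A = [[1, 2, 1], [3, 4, 0], [0, 0, 2]]
B = [[1, 2, 3, 1], [4, 5, 6, 1], [0, 0, 0, 1]]
print(AB_blocks == matmul(A, B))  # True
```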


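2.36. Let M = diag(A, B, C), where $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, B = [5], $C = \begin{bmatrix} 1 & 3 \\ 5 & 7 \end{bmatrix}$. Find $M^2$.

Because M is block diagonal, square each block:

\[
A^2 = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}, \qquad
B^2 = [25], \qquad
C^2 = \begin{bmatrix} 16 & 24 \\ 40 & 64 \end{bmatrix}
\]

so

\[
M^2 = \left[\begin{array}{cc|c|cc}
7 & 10 & & & \\
15 & 22 & & & \\ \hline
& & 25 & & \\ \hline
& & & 16 & 24 \\
& & & 40 & 64
\end{array}\right]
\]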


Miscellaneous Problem 


2.37. Let f(x) and g(x) be polynomials and let A be a square matrix. Prove

(a) (f + g)(A) = f(A) + g(A),

(b) (fg)(A) = f(A)g(A),

(c) f(A)g(A) = g(A)f(A).

Suppose $f(x) = \sum_{i=1}^{r} a_i x^i$ and $g(x) = \sum_{j=1}^{s} b_j x^j$.

(a) We can assume r = s = n by adding powers of x with 0 as their coefficients. Then

\[
f(x) + g(x) = \sum_{i=1}^{n} (a_i + b_i) x^i
\]

Hence,
\[
(f + g)(A) = \sum_{i=1}^{n} (a_i + b_i) A^i = \sum_{i=1}^{n} a_i A^i + \sum_{i=1}^{n} b_i A^i = f(A) + g(A)
\]

(b) We have $f(x)g(x) = \sum_{i,j} a_i b_j x^{i+j}$. Then

\[
f(A)g(A) = \Big(\sum_{i} a_i A^i\Big)\Big(\sum_{j} b_j A^j\Big) = \sum_{i,j} a_i b_j A^{i+j} = (fg)(A)
\]

(c) Using f(x)g(x) = g(x)f(x), we have

\[
f(A)g(A) = (fg)(A) = (gf)(A) = g(A)f(A)
\]

SUPPLEMENTARY PROBLEMS


Algebra of Matrices 

Problems 2.38-2.41 refer to the following matrices: 



2.38. Find (a) 5A - 2B, (b) 2A + 3B, (c) 2C - 3D.

2.39. Find (a) AB and (AB)C, (b) BC and A(BC). [Note that (AB)C = A(BC).]

2.40. Find (a) $A^2$ and $A^3$, (b) AD and BD, (c) CD.

2.41. Find (a) $A^T$, (b) $B^T$, (c) $(AB)^T$, (d) $A^T B^T$. [Note that $A^T B^T \ne (AB)^T$.]

Problems 2.42 and 2.43 refer to the following matrices:





C = 





2.42. Find (a) 3A - 4B, (b) AC, (c) BC, (d) AD, (e) BD, (f) CD.

2.43. Find (a) $A^T$, (b) $A^T B$, (c) $A^T C$.

2.44. Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}$. Find a 2 x 3 matrix B with distinct nonzero entries such that AB = 0.

2.45. Let $e_1 = [1, 0, 0]$, $e_2 = [0, 1, 0]$, $e_3 = [0, 0, 1]$, and $A = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \end{bmatrix}$. Find $e_1 A$, $e_2 A$, $e_3 A$.

2.46. Let $e_i = [0, \ldots, 0, 1, 0, \ldots, 0]$, where 1 is the ith entry. Show

(a) $e_i A = A_i$, the ith row of A.
(b) $B e_j^T = B^j$, the jth column of B.
(c) If $e_i A = e_i B$, for each i, then A = B.
(d) If $A e_j^T = B e_j^T$, for each j, then A = B.

2.47. Prove Theorem 2.2(iii) and (iv): (iii) (B + C)A = BA + CA, (iv) k(AB) = (kA)B = A(kB).

2.48. Prove Theorem 2.3: (i) $(A + B)^T = A^T + B^T$, (ii) $(A^T)^T = A$, (iii) $(kA)^T = kA^T$.

2.49. Show (a) If A has a zero row, then AB has a zero row. (b) If B has a zero column, then AB has a zero
column.


Square Matrices, Inverses 

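2.50. Find the diagonal and trace of each of the following matrices:

\[
\text{(a) } A = \begin{bmatrix} 2 & -5 & 8 \\ 3 & -6 & -7 \\ 4 & 0 & -1 \end{bmatrix}, \qquad
\text{(b) } B = \begin{bmatrix} 1 & 3 & -4 \\ 6 & 1 & 7 \\ 2 & -5 & -1 \end{bmatrix}
\]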

'2 

Problems 2.51-2.53 refer to A = ] 

2.51. Find (a) $A^2$ and $A^3$, (b) f(A) and g(A), where

$f(x) = x^3 - 2x^2 - 5$, $g(x) = x^2 - 3x + 17$.

2.52. Find (a) $B^2$ and $B^3$, (b) f(B) and g(B), where

$f(x) = x^2 + 2x - 22$, $g(x) = x^2 - 3x - 6$.

2.53. Find a nonzero column vector u such that Cu = 4u.

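2.54. Find the inverse of each of the following matrices (if it exists):

\[
A = \begin{bmatrix} 7 & 4 \\ 5 & 3 \end{bmatrix}, \qquad
B = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}, \qquad
C = \begin{bmatrix} 4 & -6 \\ -2 & 3 \end{bmatrix}, \qquad
D = \begin{bmatrix} 5 & -2 \\ 6 & -3 \end{bmatrix}
\]

2.55. Find the inverses of
\[
A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 5 \\ 1 & 3 & 7 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & -1 \\ 1 & 3 & -2 \end{bmatrix}.
\]
[Hint: See Problem 2.19.]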


2.56. Suppose A is invertible. Show that if AB = AC, then B = C. Give an example of a nonzero matrix
A such that AB = AC but $B \ne C$.

2.57. Find 2 x 2 invertible matrices A and B such that $A + B \ne 0$ and A + B is not invertible.

2.58. Show (a) A is invertible if and only if $A^T$ is invertible. (b) The operations of inversion and
transpose commute; that is, $(A^T)^{-1} = (A^{-1})^T$. (c) If A has a zero row or zero column, then A is
not invertible.


Diagonal and triangular matrices 

2.59. Let A = diag(1, 2, -3) and B = diag(2, -5, 0). Find

(a) AB, $A^2$, $B^2$; (b) f(A), where $f(x) = x^2 + 4x - 3$; (c) $A^{-1}$ and $B^{-1}$.

2.60. Let $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$. (a) Find $A^n$. (b) Find $B^n$.

2.61. Find all real triangular matrices A such that $A^2 = B$, where (a) $B = \begin{bmatrix} 4 & 21 \\ 0 & 25 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 4 \\ 0 & -9 \end{bmatrix}$.

2.62. Let $A = \begin{bmatrix} 5 & 2 \\ 0 & k \end{bmatrix}$. Find all numbers k for which A is a root of the polynomial:

(a) $f(x) = x^2 - 7x + 10$, (b) $g(x) = x^2 - 25$, (c) $h(x) = x^2 - 4$.

2.63. Let $B = \begin{bmatrix} 1 & 0 \\ 26 & 27 \end{bmatrix}$. Find a matrix A such that $A^3 = B$.

2.64. Let $B = \begin{bmatrix} 1 & 8 & 5 \\ 0 & 9 & 5 \\ 0 & 0 & 4 \end{bmatrix}$. Find a triangular matrix A with positive diagonal entries such that $A^2 = B$.

2.65. Using only the elements 0 and 1, find the number of 3 x 3 matrices that are (a) diagonal,
(b) upper triangular, (c) nonsingular and upper triangular. Generalize to n x n matrices.

2.66. Let $D_k = kI$, the scalar matrix belonging to the scalar k. Show

(a) $D_k A = kA$, (b) $B D_k = kB$, (c) $D_k + D_{k'} = D_{k+k'}$, (d) $D_k D_{k'} = D_{kk'}$

2.67. Suppose AB = C, where A and C are upper triangular.

(a) Find 2 x 2 nonzero matrices A, B, C, where B is not upper triangular.

(b) Suppose A is also invertible. Show that B must also be upper triangular.

Special Types of Real Matrices

2.68. Find x, y, z such that A is symmetric, where

\[
\text{(a) } A = \begin{bmatrix} 2 & x & 3 \\ 4 & 5 & y \\ z & 1 & 7 \end{bmatrix}, \qquad
\text{(b) } A = \begin{bmatrix} 7 & -6 & 2x \\ y & z & -2 \\ x & -2 & 5 \end{bmatrix}
\]

2.69. Suppose A is a square matrix. Show (a) $A + A^T$ is symmetric, (b) $A - A^T$ is skew-symmetric,
(c) A = B + C, where B is symmetric and C is skew-symmetric.

2.70. Write $A = \begin{bmatrix} 4 & 5 \\ 1 & 3 \end{bmatrix}$ as the sum of a symmetric matrix B and a skew-symmetric matrix C.

2.71. Suppose A and B are symmetric. Show that the following are also symmetric:

(a) A + B; (b) kA, for any scalar k; (c) $A^2$;

(d) $A^n$, for n > 0; (e) f(A), for any polynomial f(x).

2.72. Find a 2 x 2 orthogonal matrix P whose first row is a multiple of
(a) (3, -4), (b) (1, 2).

2.73. Find a 3 x 3 orthogonal matrix P whose first two rows are multiples of

(a) (1, 2, 3) and (0, -2, 3), (b) (1, 3, 1) and (1, 0, -1).

2.74. Suppose A and B are orthogonal matrices. Show that $A^T$, $A^{-1}$, AB are also orthogonal.

2.75. Which of the following matrices are normal?

\[
A = \begin{bmatrix} 3 & -4 \\ 4 & 3 \end{bmatrix}, \qquad
B = \begin{bmatrix} 1 & -2 \\ 2 & 3 \end{bmatrix}, \qquad
C = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\]

Complex Matrices 

2.76. Find real numbers x, y, z such that A is Hermitian, where

\[
A = \begin{bmatrix} 3 & x + 2i & yi \\ 3 - 2i & 0 & 1 + zi \\ yi & 1 - xi & -1 \end{bmatrix}
\]

2.77. Suppose A is a complex matrix. Show that $AA^H$ and $A^H A$ are Hermitian.

2.78. Let A be a square matrix. Show that (a) $A + A^H$ is Hermitian, (b) $A - A^H$ is skew-Hermitian,
(c) A = B + C, where B is Hermitian and C is skew-Hermitian.

2.79. Determine which of the following matrices are unitary:

\[
A = \begin{bmatrix} i/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -i/2 \end{bmatrix}, \qquad
B = \frac{1}{2} \begin{bmatrix} 1 + i & 1 - i \\ 1 - i & 1 + i \end{bmatrix}, \qquad
C = \frac{1}{2} \begin{bmatrix} 1 & -i & -1 + i \\ i & 1 & 1 + i \\ 1 + i & -1 + i & 0 \end{bmatrix}
\]

2.80. Suppose A and B are unitary. Show that $A^H$, $A^{-1}$, AB are unitary.

2.81. Determine which of the following matrices are normal:

\[
A = \begin{bmatrix} 3 + 4i & 1 \\ i & 2 + 3i \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 1 & 0 \\ 1 - i & i \end{bmatrix}
\]

Block Matrices





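2.82. Let
\[
U = \left[\begin{array}{cc|ccc}
1 & 2 & 0 & 0 & 0 \\
3 & 4 & 0 & 0 & 0 \\ \hline
0 & 0 & 5 & 1 & 2 \\
0 & 0 & 3 & 4 & 1
\end{array}\right]
\quad\text{and}\quad
V = \left[\begin{array}{cc|cc}
3 & -2 & 0 & 0 \\
2 & 4 & 0 & 0 \\ \hline
0 & 0 & 1 & 2 \\
0 & 0 & 2 & -3 \\
0 & 0 & -4 & 1
\end{array}\right]
\]

(a) Find UV using block multiplication. (b) Are U and V block diagonal matrices?
(c) Is UV block diagonal?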


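2.83. Partition each of the following matrices so that it becomes a square block matrix with as many
diagonal blocks as possible:

\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 3 \end{bmatrix}, \qquad
B = \begin{bmatrix}
1 & 2 & 0 & 0 & 0 \\
3 & 0 & 0 & 0 & 0 \\
0 & 0 & 4 & 0 & 0 \\
0 & 0 & 5 & 0 & 0 \\
0 & 0 & 0 & 0 & 6
\end{bmatrix}, \qquad
C = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 2 & 0 & 0 \end{bmatrix}
\]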


2.84. Find $M^2$ and $M^3$ for

\[
\text{(a) } M = \left[\begin{array}{c|cc|c}
2 & 0 & 0 & 0 \\ \hline
0 & 1 & 4 & 0 \\
0 & 2 & 1 & 0 \\ \hline
0 & 0 & 0 & 3
\end{array}\right], \qquad
\text{(b) } M = \left[\begin{array}{cc|cc}
1 & 1 & 0 & 0 \\
2 & 3 & 0 & 0 \\ \hline
0 & 0 & 1 & 2 \\
0 & 0 & 4 & 5
\end{array}\right]
\]

2.85. For each matrix M in Problem 2.84, find f(M) where $f(x) = x^2 + 4x - 5$.

2.86. Suppose $U = [U_{ik}]$ and $V = [V_{kj}]$ are block matrices for which UV is defined and the number of
columns of each block $U_{ik}$ is equal to the number of rows of each block $V_{kj}$. Show that $UV = [W_{ij}]$,
where $W_{ij} = \sum_k U_{ik} V_{kj}$.

2.87. Suppose M and N are block diagonal matrices where corresponding blocks have the same size, say
$M = \mathrm{diag}(A_i)$ and $N = \mathrm{diag}(B_i)$. Show

(i) $M + N = \mathrm{diag}(A_i + B_i)$, (iii) $MN = \mathrm{diag}(A_i B_i)$,

(ii) $kM = \mathrm{diag}(kA_i)$, (iv) $f(M) = \mathrm{diag}(f(A_i))$ for any polynomial f(x).


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $A = [R_1; R_2; \ldots]$ denotes a matrix A with rows $R_1, R_2, \ldots$

2.38. (a) [-5,10; 27,-34], (b) [17,4; -12,13], (c) [-7,-27,11; -8,36,-37] 

2.39. (a) [-7,14; 39,-28], [21,105,-98; -17,-285,296] 

(b) [5,-15,20; 8,60,-59], [21,105,-98; -17,-285,296] 

2.40. (a) [7,-6; -9,22], [-11,38; 57,-106]; 

(b) [11,-9,17; -7,53,-39], [15,35,-5; 10,-98,69]; (c) not defined 

2.41. (a) [1,3; 2,-4], (b) [5,-6; 0,7], (c) [-7,39; 14,-28], (d) [5,15; 10,-40] 

2.42. (a) [-13,-3,18; 4,17,0], (b) [-5,-2,4,5; 11,-3,-12,18], 

(c) [11,-12,0,-5; -15,5,8,4], (d) [9; 9], (e) [-1; 9], 


(f) not defined

2.43. (a) [1,0; -1,3; 2,4], (b) [4,0,-3; -7,-6,12; 4,-8,6], (c) not defined

2.44. [2,4,6; -1,-2,-3]

2.45. $[a_1, a_2, a_3, a_4]$, $[b_1, b_2, b_3, b_4]$, $[c_1, c_2, c_3, c_4]$

2.50. (a) 2, -6, -1, tr(A) = -5, (b) 1, 1, -1, tr(B) = 1, (c) not defined

2.51. (a) [-11,-15; 9,-14], [-67,40; -24,-59], (b) [-50,70; -42,-36], g(A) = 0

2.52. (a) [14,4; -2,34], [60,-52; 26,-200], (b) f(B) = 0, [-4,10; -5,46]

2.53. $u = [2a, a]^T$

2.54. [3,-4; -5,7], [-5/2,3/2; 2,-1], not defined, [1,-2/3; 2,-5/3]

2.55. [1,1,-1; 2,-5,3; -1,2,-1], [1,1,0; -1,-3,1; -1,-4,1] 

2.56. A = [1,2; 1,2], B =[0,0; 1,1], C=[2,2; 0,0] 

2.57. A = [1,2; 0,3]; B = [4,3; 3,0]

2.58. (c) Hint: Use Problem 2.48

2.59. (a) AB = diag(2, -10, 0), $A^2$ = diag(1, 4, 9), $B^2$ = diag(4, 25, 0);

(b) f(A) = diag(2, 9, -6); (c) $A^{-1}$ = diag(1, 1/2, -1/3), $B^{-1}$ does not exist

2.60. (a) [1,2n; 0,1], (b) [1,n,n(n-1)/2; 0,1,n; 0,0,1]

2.61. (a) [2,3; 0,5], [-2,-3; 0,-5], [2,-7; 0,-5], [-2,7; 0,5], (b) none

2.62. (a) k = 2, (b) k = -5, (c) none

2.63. [1,0; 2,3] 

2.64. [1,2,1; 0,3,1; 0,0,2] 

2.65. All entries below the diagonal must be 0 to be upper triangular, and all diagonal entries must be 1 to 
be nonsingular. 

(a) 8 ($2^n$), (b) $2^6$ ($2^{n(n+1)/2}$), (c) $2^3$ ($2^{n(n-1)/2}$)

2.67. (a) A = [1,1; 0,0], B = [1,2; 3,4], C = [4,6; 0,0]

2.68. (a) x = 4, y = 1, z = 3; (b) x = 0, y = -6, z any real number

2.69. (c) Hint: Let $B = \tfrac{1}{2}(A + A^T)$ and $C = \tfrac{1}{2}(A - A^T)$.

2.70. B = [4,3; 3,3], C = [0,2; -2,0]

2.72. (a) [3/5, -4/5; 4/5, 3/5], (b) $[1/\sqrt{5}, 2/\sqrt{5};\ 2/\sqrt{5}, -1/\sqrt{5}]$

2.73. (a) $[1/\sqrt{14}, 2/\sqrt{14}, 3/\sqrt{14};\ 0, -2/\sqrt{13}, 3/\sqrt{13};\ 12/\sqrt{157}, -3/\sqrt{157}, -2/\sqrt{157}]$

(b) $[1/\sqrt{11}, 3/\sqrt{11}, 1/\sqrt{11};\ 1/\sqrt{2}, 0, -1/\sqrt{2};\ 3/\sqrt{22}, -2/\sqrt{22}, 3/\sqrt{22}]$

2.75. A, C

2.76. x = 3, y = 0, z = 3

2.78. (c) Hint: Let $B = \tfrac{1}{2}(A + A^H)$ and $C = \tfrac{1}{2}(A - A^H)$.

2.79. A,B,C 

2.81. A 

2.82. (a) UV = diag([7,6; 17,10], [-1,9; 7,-5]); (b) no; (c) yes

2.83. A: line between first and second rows (columns);

B: line between second and third rows (columns) and between fourth and fifth rows (columns);
C: C itself; no further partitioning of C is possible.

2.84. (a) $M^2$ = diag([4], [9,8; 4,9], [9]),

$M^3$ = diag([8], [25,44; 22,25], [27])

(b) $M^2$ = diag([3,4; 8,11], [9,12; 24,33])

$M^3$ = diag([11,15; 30,41], [57,78; 156,213])

2.85. (a) diag([7], [8,24; 12,8], [16]), (b) diag([2,8; 16,18], [8,20; 40,48])




Systems of Linear Equations


3.1 Introduction 


Systems of linear equations play an important and motivating role in the subject of linear algebra. In fact, 
many problems in linear algebra reduce to finding the solution of a system of linear equations. Thus, the 
techniques introduced in this chapter will be applicable to abstract ideas introduced later. On the other 
hand, some of the abstract results will give us new insights into the structure and properties of systems of 
linear equations. 

All our systems of linear equations involve scalars as both coefficients and constants, and such scalars 
may come from any number field K. There is almost no loss in generality if the reader assumes that all our 
scalars are real numbers—that is, that they come from the real field R. 


3.2 Basic Definitions, Solutions 


This section gives basic definitions connected with the solutions of systems of linear equations. The actual 
algorithms for finding such solutions will be treated later. 

Linear Equation and Solutions 

A linear equation in unknowns $x_1, x_2, \ldots, x_n$ is an equation that can be put in the standard form

\[
a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b \tag{3.1}
\]

where $a_1, a_2, \ldots, a_n$, and b are constants. The constant $a_k$ is called the coefficient of $x_k$, and b is called the
constant term of the equation.

A solution of the linear equation (3.1) is a list of values for the unknowns or, equivalently, a vector u in
$K^n$, say

\[
x_1 = k_1, \quad x_2 = k_2, \quad \ldots, \quad x_n = k_n \qquad\text{or}\qquad u = (k_1, k_2, \ldots, k_n)
\]

such that the following statement (obtained by substituting $k_i$ for $x_i$ in the equation) is true:

\[
a_1 k_1 + a_2 k_2 + \cdots + a_n k_n = b
\]


In such a case we say that u satisfies the equation. 

Remark: Equation (3.1) implicitly assumes there is an ordering of the unknowns. In order to avoid
subscripts, we will usually use x, y for two unknowns; x, y, z for three unknowns; and x, y, z, t for four
unknowns; they will be ordered as shown.

EXAMPLE 3.1 Consider the following linear equation in three unknowns x, y, z:

x + 2y - 3z = 6

We note that x = 5, y = 2, z = 1, or, equivalently, the vector u = (5, 2, 1) is a solution of the equation. That is,

5 + 2(2) - 3(1) = 6 or 5 + 4 - 3 = 6 or 6 = 6

On the other hand, w = (1, 2, 3) is not a solution, because on substitution, we do not get a true statement:

1 + 2(2) - 3(3) = 6 or 1 + 4 - 9 = 6 or -4 = 6
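The substitution test of Example 3.1 amounts to one sum of products. A minimal sketch, in which the function name `satisfies` and the argument order are our own choices:

```python
def satisfies(coeffs, b, u):
    """True if a1*k1 + ... + an*kn equals b for u = (k1, ..., kn)."""
    if len(coeffs) != len(u):
        raise ValueError("u must have one value per unknown")
    return sum(a * k for a, k in zip(coeffs, u)) == b

# The equation x + 2y - 3z = 6 of Example 3.1.
coeffs, b = [1, 2, -3], 6

print(satisfies(coeffs, b, (5, 2, 1)))  # True:  5 + 4 - 3 = 6
print(satisfies(coeffs, b, (1, 2, 3)))  # False: 1 + 4 - 9 = -4
```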

System of Linear Equations 

A system of linear equations is a list of linear equations with the same unknowns. In particular, a system of
m linear equations $L_1, L_2, \ldots, L_m$ in n unknowns $x_1, x_2, \ldots, x_n$ can be put in the standard form

\[
\begin{array}{c}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = b_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n = b_m
\end{array}
\tag{3.2}
\]

where the $a_{ij}$ and $b_i$ are constants. The number $a_{ij}$ is the coefficient of the unknown $x_j$ in the equation $L_i$,
and the number $b_i$ is the constant of the equation $L_i$.

The system (3.2) is called an m x n (read: m by n) system. It is called a square system if m = n, that is,
if the number m of equations is equal to the number n of unknowns.

The system (3.2) is said to be homogeneous if all the constant terms are zero, that is, if $b_1 = 0$,
$b_2 = 0$, ..., $b_m = 0$. Otherwise the system is said to be nonhomogeneous.

A solution (or a particular solution) of the system (3.2) is a list of values for the unknowns or,
equivalently, a vector u in $K^n$, which is a solution of each of the equations in the system. The set of all
solutions of the system is called the solution set or the general solution of the system.

EXAMPLE 3.2 Consider the following system of linear equations: 

\[
\begin{array}{r}
x_1 + x_2 + 4x_3 + 3x_4 = 5 \\
2x_1 + 3x_2 + x_3 - 2x_4 = 1 \\
x_1 + 2x_2 - 5x_3 + 4x_4 = 3
\end{array}
\]

It is a 3 x 4 system because it has three equations in four unknowns. Determine whether (a) u = (—8, 6,1,1) and 
(b) v= (—10,5,1,2) are solutions of the system. 

(a) Substitute the values of u in each equation, obtaining 

—8 + 6 + 4(1) + 3(1) = 5 or —8 + 6 + 4 + 3 = 5 or 5 = 5 
2(-8) + 3(6) + 1 -2(1) = 1 or -16+18 + 1-2 = 1 or 1 = 1 
-8+ 2(6)-5(1)+4(1) = 3 or -8+12-5 + 4 = 3 or 3 = 3 

Yes, u is a solution of the system because it is a solution of each equation. 

(b) Substitute the values of v into each successive equation, obtaining 

-10 +5+ 4(1)+ 3(2) = 5 or -10 + 5 + 4 + 6 = 5 or 5 = 5 
2(—10)+ 3(5)+ 1-2(2) = 1 or -20+15 + 1-4=1 or -8 = 1 

No, v is not a solution of the system, because it is not a solution of the second equation. (We do not need to
substitute v into the third equation.)

The system (3.2) of linear equations is said to be consistent if it has one or more solutions, and it is
said to be inconsistent if it has no solution. If the field K of scalars is infinite, such as when K is the real
field R or the complex field C, then we have the following important result.

THEOREM 3.1: Suppose the field K is infinite. Then any system $\mathcal{L}$ of linear equations has
(i) a unique solution, (ii) no solution, or (iii) an infinite number of solutions.

This situation is pictured in Fig. 3-1. The three cases have a geometrical description when the system $\mathcal{L}$
consists of two equations in two unknowns (Section 3.4).



Figure 3-1 


Augmented and Coefficient Matrices of a System 

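Consider again the general system (3.2) of m equations in n unknowns. Such a system has associated with
it the following two matrices:

\[
M = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{bmatrix}
\quad\text{and}\quad
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]

The first matrix M is called the augmented matrix of the system, and the second matrix A is called the
coefficient matrix.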

The coefficient matrix A is simply the matrix of coefficients, which is the augmented matrix M without
the last column of constants. Some texts write M = [A, B] to emphasize the two parts of M, where B
denotes the column vector of constants. The augmented matrix M and the coefficient matrix A of the
system in Example 3.2 are as follows:

\[
M = \begin{bmatrix} 1 & 1 & 4 & 3 & 5 \\ 2 & 3 & 1 & -2 & 1 \\ 1 & 2 & -5 & 4 & 3 \end{bmatrix}
\quad\text{and}\quad
A = \begin{bmatrix} 1 & 1 & 4 & 3 \\ 2 & 3 & 1 & -2 \\ 1 & 2 & -5 & 4 \end{bmatrix}
\]

As expected, A consists of all the columns of M except the last, which is the column of constants. 

Clearly, a system of linear equations is completely determined by its augmented matrix M, and vice 
versa. Specifically, each row of M corresponds to an equation of the system, and each column of M 
corresponds to the coefficients of an unknown, except for the last column, which corresponds to the 
constants of the system. 
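Because the augmented matrix determines the system, a candidate solution can be tested directly against the rows of M. The sketch below uses the augmented matrix of Example 3.2; the helper names `split_augmented` and `first_failed_equation` are hypothetical, introduced only for this illustration.

```python
def split_augmented(M):
    """Split M = [A, B] into the coefficient matrix A and the constants B."""
    A = [row[:-1] for row in M]
    B = [row[-1] for row in M]
    return A, B

def first_failed_equation(M, u):
    """Number (starting at 1) of the first equation u fails, or None."""
    A, B = split_augmented(M)
    for number, (row, const) in enumerate(zip(A, B), start=1):
        if sum(a * k for a, k in zip(row, u)) != const:
            return number
    return None

# Augmented matrix of the 3 x 4 system in Example 3.2.
M = [[1, 1, 4, 3, 5],
     [2, 3, 1, -2, 1],
     [1, 2, -5, 4, 3]]

print(first_failed_equation(M, (-8, 6, 1, 1)))    # None: u is a solution
print(first_failed_equation(M, (-10, 5, 1, 2)))   # 2: v fails the second equation
```

Stopping at the first failed equation mirrors the remark in Example 3.2 that v need not be substituted into the third equation.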

Degenerate Linear Equations 

A linear equation is said to be degenerate if all the coefficients are zero, that is, if it has the form

\[ 0x_1 + 0x_2 + \cdots + 0x_n = b \tag{3.3} \]

The solution of such an equation depends only on the value of the constant $b$. Specifically,

(i) If $b \neq 0$, then the equation has no solution.

(ii) If $b = 0$, then every vector $u = (k_1, k_2, \ldots, k_n)$ in $K^n$ is a solution.

The following theorem applies. 

THEOREM 3.2: Let $\mathcal{L}$ be a system of linear equations that contains a degenerate equation $L$, say with
constant $b$.

(i) If $b \neq 0$, then the system $\mathcal{L}$ has no solution.

(ii) If $b = 0$, then $L$ may be deleted from the system without changing the solution set
of the system.

Part (i) comes from the fact that the degenerate equation has no solution, so the system has no solution.
Part (ii) comes from the fact that every element in $K^n$ is a solution of the degenerate equation.


Leading Unknown in a Nondegenerate Linear Equation 

Now let $L$ be a nondegenerate linear equation. This means one or more of the coefficients of $L$ are not zero.
By the leading unknown of $L$, we mean the first unknown in $L$ with a nonzero coefficient. For example, $x_3$
and $y$ are the leading unknowns, respectively, in the equations

\[ 0x_1 + 0x_2 + 5x_3 + 6x_4 + 0x_5 + 8x_6 = 7 \quad\text{and}\quad 0x + 2y - 4z = 5 \]

We frequently omit terms with zero coefficients, so the above equations would be written as

\[ 5x_3 + 6x_4 + 8x_6 = 7 \quad\text{and}\quad 2y - 4z = 5 \]

In such a case, the leading unknown appears first. 


3.3 Equivalent Systems, Elementary Operations 


Consider the system (3.2) of $m$ linear equations in $n$ unknowns. Let $L$ be the linear equation obtained by
multiplying the $m$ equations by constants $c_1, c_2, \ldots, c_m$, respectively, and then adding the resulting
equations. Specifically, let $L$ be the following linear equation:

\[ (c_1 a_{11} + \cdots + c_m a_{m1})x_1 + \cdots + (c_1 a_{1n} + \cdots + c_m a_{mn})x_n = c_1 b_1 + \cdots + c_m b_m \]

Then $L$ is called a linear combination of the equations in the system. One can easily show (Problem 3.43)
that any solution of the system (3.2) is also a solution of the linear combination $L$.

EXAMPLE 3.3 Let $L_1$, $L_2$, $L_3$ denote, respectively, the three equations in Example 3.2. Let $L$ be the
equation obtained by multiplying $L_1$, $L_2$, $L_3$ by $3$, $-2$, $4$, respectively, and then adding. Namely,

\[
\begin{array}{rrcl}
3L_1\colon & 3x_1 + 3x_2 + 12x_3 + 9x_4 & = & 15 \\
-2L_2\colon & -4x_1 - 6x_2 - 2x_3 + 4x_4 & = & -2 \\
4L_3\colon & 4x_1 + 8x_2 - 20x_3 + 16x_4 & = & 12 \\
\hline
\text{(Sum) } L\colon & 3x_1 + 5x_2 - 10x_3 + 29x_4 & = & 25
\end{array}
\]

Then $L$ is a linear combination of $L_1$, $L_2$, $L_3$. As expected, the solution $u = (-8, 6, 1, 1)$ of the system is also a
solution of $L$. That is, substituting $u$ in $L$, we obtain a true statement:

\[ 3(-8) + 5(6) - 10(1) + 29(1) = 25 \quad\text{or}\quad -24 + 30 - 10 + 29 = 25 \quad\text{or}\quad 25 = 25 \]

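
The computation in Example 3.3 can be checked mechanically. The following is a minimal sketch, assuming each equation is stored as a pair (list of coefficients, constant); the function names `linear_combination` and `satisfies` are hypothetical.

```python
def linear_combination(equations, multipliers):
    """Return c_1*L_1 + ... + c_m*L_m for equations given as (coefficients, constant)."""
    n = len(equations[0][0])
    coeffs = [0] * n
    const = 0
    for (row, b), c in zip(equations, multipliers):
        for j in range(n):
            coeffs[j] += c * row[j]
        const += c * b
    return coeffs, const


def satisfies(equation, u):
    """True if the vector u is a solution of the given equation."""
    coeffs, const = equation
    return sum(a * x for a, x in zip(coeffs, u)) == const


# The three equations of Example 3.2 and the multipliers 3, -2, 4.
system = [([1, 1, 4, 3], 5), ([2, 3, 1, -2], 1), ([1, 2, -5, 4], 3)]
L = linear_combination(system, [3, -2, 4])
u = (-8, 6, 1, 1)

print(L)                                      # the combined equation
print(all(satisfies(eq, u) for eq in system))  # u solves every equation of the system
print(satisfies(L, u))                         # and therefore also solves L
```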
The following theorem holds. 

THEOREM 3.3: Two systems of linear equations have the same solutions if and only if each equation in 
each system is a linear combination of the equations in the other system. 

Two systems of linear equations are said to be equivalent if they have the same solutions. The next 
subsection shows one way to obtain equivalent systems of linear equations. 

Elementary Operations 

The following operations on a system of linear equations $L_1, L_2, \ldots, L_m$ are called elementary operations.

[E_1] Interchange two of the equations. We indicate that the equations $L_i$ and $L_j$ are interchanged by
writing:

“Interchange $L_i$ and $L_j$” or “$L_i \leftrightarrow L_j$”

[E_2] Replace an equation by a nonzero multiple of itself. We indicate that equation $L_i$ is replaced by $kL_i$
(where $k \neq 0$) by writing

“Replace $L_i$ by $kL_i$” or “$kL_i \to L_i$”

[E_3] Replace an equation by the sum of a multiple of another equation and itself. We indicate that
equation $L_j$ is replaced by the sum of $kL_i$ and $L_j$ by writing

“Replace $L_j$ by $kL_i + L_j$” or “$kL_i + L_j \to L_j$”

The arrow $\to$ in [E_2] and [E_3] may be read as “replaces.”

The main property of the above elementary operations is contained in the following theorem (proved 
in Problem 3.45). 

THEOREM 3.4: Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ of linear
equations by a finite sequence of elementary operations. Then $\mathcal{M}$ and $\mathcal{L}$ have the same
solutions.

Remark: Sometimes (say to avoid fractions when all the given scalars are integers) we may apply
[E_2] and [E_3] in one step; that is, we may apply the following operation:

[E] Replace equation $L_j$ by the sum of $kL_i$ and $k'L_j$ (where $k' \neq 0$), written

“Replace $L_j$ by $kL_i + k'L_j$” or “$kL_i + k'L_j \to L_j$”

We emphasize that in operations [E_3] and [E], only equation $L_j$ is changed.

Gaussian elimination, our main method for finding the solution of a given system of linear 
equations, consists of using the above operations to transform a given system into an equivalent 
system whose solution can be easily obtained. 


The details of Gaussian elimination are discussed in subsequent sections. 


3.4 Small Square Systems of Linear Equations 


This section considers the special cases of one equation in one unknown and two equations in two
unknowns. These simple systems are treated separately because their solution sets can be described
geometrically, and their properties motivate the general case.

Linear Equation in One Unknown

The following simple basic result is proved in Problem 3.5. 

THEOREM 3.5: Consider the linear equation $ax = b$.

(i) If $a \neq 0$, then $x = b/a$ is a unique solution of $ax = b$.

(ii) If $a = 0$, but $b \neq 0$, then $ax = b$ has no solution.

(iii) If $a = 0$ and $b = 0$, then every scalar $k$ is a solution of $ax = b$.

EXAMPLE 3.4 Solve (a) $4x - 1 = x + 6$, (b) $2x - 5 - x = x + 3$, (c) $4 + x - 3 = 2x + 1 - x$.

(a) Rewrite the equation in standard form, obtaining $3x = 7$. Then $x = \frac{7}{3}$ is the unique solution [Theorem 3.5(i)].

(b) Rewrite the equation in standard form, obtaining $0x = 8$. The equation has no solution [Theorem 3.5(ii)].

(c) Rewrite the equation in standard form, obtaining $0x = 0$. Then every scalar $k$ is a solution [Theorem 3.5(iii)].
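
The three cases of Theorem 3.5 translate directly into a small function. This is a sketch that assumes integer (or rational) inputs already in the standard form $ax = b$; the name `solve_one_unknown` and the string labels it returns are hypothetical.

```python
from fractions import Fraction


def solve_one_unknown(a, b):
    """Classify and solve ax = b following the three cases of Theorem 3.5."""
    if a != 0:
        return ("unique", Fraction(b, a))   # case (i): x = b/a
    if b != 0:
        return ("none", None)               # case (ii): 0x = b with b != 0
    return ("all", None)                    # case (iii): 0x = 0


# The standard forms obtained in Example 3.4.
print(solve_one_unknown(3, 7))   # (a) 3x = 7
print(solve_one_unknown(0, 8))   # (b) 0x = 8
print(solve_one_unknown(0, 0))   # (c) 0x = 0
```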


System of Two Linear Equations in Two Unknowns (2x2 System) 

Consider a system of two nondegenerate linear equations in two unknowns $x$ and $y$, which can be put in the
standard form

\[
\begin{aligned}
A_1 x + B_1 y &= C_1 \\
A_2 x + B_2 y &= C_2
\end{aligned}
\tag{3.4}
\]

Because the equations are nondegenerate, $A_1$ and $B_1$ are not both zero, and $A_2$ and $B_2$ are not both zero.

The general solution of the system (3.4) belongs to one of three types as indicated in Fig. 3-1. If $\mathbf{R}$ is the
field of scalars, then the graph of each equation is a line in the plane $\mathbf{R}^2$ and the three types may be
described geometrically as pictured in Fig. 3-2. Specifically,

(1) The system has exactly one solution.

Here the two lines intersect in one point [Fig. 3-2(a)]. This occurs when the lines have distinct slopes
or, equivalently, when the coefficients of $x$ and $y$ are not proportional:

\[ \frac{A_1}{A_2} \neq \frac{B_1}{B_2} \quad\text{or, equivalently,}\quad A_1 B_2 - A_2 B_1 \neq 0 \]

For example, in Fig. 3-2(a), $1/3 \neq -1/2$.



[Figure 3-2: three pairs of lines in the plane. (a) Two lines intersecting in one point, one of them $L_2\colon 3x + 2y = 12$.
(b) Two parallel lines, one of them $L_2\colon 2x + 6y = -8$. (c) Two coincident lines, $L_1\colon x + 2y = 4$ and $L_2\colon 2x + 4y = 8$.]

Figure 3-2

(2) The system has no solution.

Here the two lines are parallel [Fig. 3-2(b)]. This occurs when the lines have the same slopes but
different $y$ intercepts, or when

\[ \frac{A_1}{A_2} = \frac{B_1}{B_2} \neq \frac{C_1}{C_2} \]

For example, in Fig. 3-2(b), $1/2 = 3/6 \neq -3/8$.

(3) The system has an infinite number of solutions.

Here the two lines coincide [Fig. 3-2(c)]. This occurs when the lines have the same slopes and same $y$
intercepts, or when the coefficients and constants are proportional,

\[ \frac{A_1}{A_2} = \frac{B_1}{B_2} = \frac{C_1}{C_2} \]

For example, in Fig. 3-2(c), $1/2 = 2/4 = 4/8$.

Remark: The following expression and its value is called a determinant of order two:

\[ \begin{vmatrix} A_1 & B_1 \\ A_2 & B_2 \end{vmatrix} = A_1 B_2 - A_2 B_1 \]

Determinants will be studied in Chapter 8. Thus, the system (3.4) has a unique solution if and only if the 
determinant of its coefficients is not zero. (We show later that this statement is true for any square system 
of linear equations.) 
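
The three geometric cases can be told apart without dividing, by testing cross products instead of ratios (this also avoids trouble when some $A_2$, $B_2$, or $C_2$ is zero). The sketch below assumes both equations are nondegenerate, as in (3.4); the function name `classify_2x2` and its return labels are hypothetical.

```python
def classify_2x2(A1, B1, C1, A2, B2, C2):
    """Classify the system A1*x + B1*y = C1, A2*x + B2*y = C2 (both equations nondegenerate)."""
    det = A1 * B2 - A2 * B1          # determinant of the coefficients
    if det != 0:
        return "unique"              # lines intersect in one point
    # Coefficients are proportional; the lines coincide exactly when the
    # constants are proportional as well (all 2x2 cross products vanish).
    if A1 * C2 - A2 * C1 == 0 and B1 * C2 - B2 * C1 == 0:
        return "infinite"
    return "none"                    # parallel, distinct lines


print(classify_2x2(1, 2, 4, 2, 4, 8))    # the pair of lines in Fig. 3-2(c)
print(classify_2x2(1, 2, 4, 2, 4, 9))    # illustrative: same slopes, different constants
print(classify_2x2(1, 1, 1, 1, -1, 3))   # illustrative: distinct slopes
```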


Elimination Algorithm 

The solution to system (3.4) can be obtained by the process of elimination, whereby we reduce the system 
to a single equation in only one unknown. Assuming the system has a unique solution, this elimination 
algorithm has two parts. 

ALGORITHM 3.1: The input consists of two nondegenerate linear equations $L_1$ and $L_2$ in two unknowns
with a unique solution.

Part A. (Forward Elimination) Multiply each equation by a constant so that the resulting coefficients of 
one unknown are negatives of each other, and then add the two equations to obtain a new 
equation L that has only one unknown. 

Part B. (Back-Substitution) Solve for the unknown in the new equation L (which contains only one 
unknown), substitute this value of the unknown into one of the original equations, and then 
solve to obtain the value of the other unknown. 

Part A of Algorithm 3.1 can be applied to any system even if the system does not have a unique 
solution. In such a case, the new equation L will be degenerate and Part B will not apply. 

EXAMPLE 3.5 (Unique Case). Solve the system

\[
\begin{aligned}
L_1\colon\quad 2x - 3y &= -8 \\
L_2\colon\quad 3x + 4y &= 5
\end{aligned}
\]

The unknown $x$ is eliminated from the equations by forming the new equation $L = -3L_1 + 2L_2$. That is, we
multiply $L_1$ by $-3$ and $L_2$ by $2$ and add the resulting equations as follows:

\[
\begin{array}{rrcl}
-3L_1\colon & -6x + 9y & = & 24 \\
2L_2\colon & 6x + 8y & = & 10 \\
\hline
\text{Addition:} & 17y & = & 34
\end{array}
\]

We now solve the new equation for $y$, obtaining $y = 2$. We substitute $y = 2$ into one of the original equations, say $L_1$,
and solve for the other unknown $x$, obtaining

\[ 2x - 3(2) = -8 \quad\text{or}\quad 2x - 6 = -8 \quad\text{or}\quad 2x = -2 \quad\text{or}\quad x = -1 \]

Thus, $x = -1$, $y = 2$, or the pair $u = (-1, 2)$ is the unique solution of the system. The unique solution is expected,
because $2/3 \neq -3/4$. [Geometrically, the lines corresponding to the equations intersect at the point $(-1, 2)$.]

EXAMPLE 3.6 (Nonunique Cases)

(a) Solve the system

\[
\begin{aligned}
L_1\colon\quad x - 3y &= 4 \\
L_2\colon\quad -2x + 6y &= 5
\end{aligned}
\]

We eliminate $x$ from the equations by multiplying $L_1$ by $2$ and adding it to $L_2$, that is, by forming the new
equation $L = 2L_1 + L_2$. This yields the degenerate equation

\[ 0x + 0y = 13 \]

which has a nonzero constant $b = 13$. Thus, this equation and the system have no solution. This is expected,
because $1/(-2) = -3/6 \neq 4/5$. (Geometrically, the lines corresponding to the equations are parallel.)

(b) Solve the system

\[
\begin{aligned}
L_1\colon\quad x - 3y &= 4 \\
L_2\colon\quad -2x + 6y &= -8
\end{aligned}
\]

We eliminate $x$ from the equations by multiplying $L_1$ by $2$ and adding it to $L_2$, that is, by forming the new
equation $L = 2L_1 + L_2$. This yields the degenerate equation

\[ 0x + 0y = 0 \]

where the constant term is also zero. Thus, the system has an infinite number of solutions, which correspond to
the solutions of either equation. This is expected, because $1/(-2) = -3/6 = 4/(-8)$. (Geometrically, the lines
corresponding to the equations coincide.)

To find the general solution, let $y = a$, and substitute into $L_1$ to obtain

\[ x - 3a = 4 \quad\text{or}\quad x = 3a + 4 \]

Thus, the general solution of the system is

\[ x = 3a + 4, \quad y = a \quad\text{or}\quad u = (3a + 4,\ a) \]

where $a$ (called a parameter) is any scalar.
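
Examples 3.5 and 3.6 can be reproduced with a short sketch of the elimination algorithm. It assumes integer coefficients, two nondegenerate equations given as triples $(A, B, C)$, and that $x$ appears in at least one equation; it eliminates $x$ with the multipliers $-A_2$ and $A_1$, which is one possible choice of constants in Part A. The name `solve_2x2` and its return labels are hypothetical.

```python
from fractions import Fraction


def solve_2x2(L1, L2):
    """Elimination for two nondegenerate equations (A, B, C) meaning A*x + B*y = C."""
    A1, B1, C1 = L1
    A2, B2, C2 = L2
    if A1 == 0 and A2 == 0:
        raise ValueError("x appears in neither equation; eliminate y instead")
    # Part A: the combination L = -A2*L1 + A1*L2 has zero coefficient of x.
    B = -A2 * B1 + A1 * B2
    C = -A2 * C1 + A1 * C2
    if B == 0:
        # L is degenerate: 0x + 0y = C.
        return ("none", None) if C != 0 else ("infinite", None)
    # Part B: solve L for y, then back-substitute into an equation containing x.
    y = Fraction(C, B)
    A, Bc, Cc = L1 if A1 != 0 else L2
    x = (Cc - Bc * y) / A
    return ("unique", (x, y))


print(solve_2x2((2, -3, -8), (3, 4, 5)))    # Example 3.5
print(solve_2x2((1, -3, 4), (-2, 6, 5)))    # Example 3.6(a)
print(solve_2x2((1, -3, 4), (-2, 6, -8)))   # Example 3.6(b)
```

For Example 3.5 the multipliers chosen by the sketch are $-3$ and $2$, the same ones used in the text.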

3.5 Systems in Triangular and Echelon Forms 

The main method for solving systems of linear equations, Gaussian elimination, is treated in Section 3.6. 
Here we consider two simple types of systems of linear equations: systems in triangular form and the more 
general systems in echelon form. 

Triangular Form 

Consider the following system of linear equations, which is in triangular form:

\[
\begin{aligned}
2x_1 - 3x_2 + 5x_3 - 2x_4 &= 9 \\
5x_2 - x_3 + 3x_4 &= 1 \\
7x_3 - x_4 &= 3 \\
2x_4 &= 8
\end{aligned}
\]

That is, the first unknown $x_1$ is the leading unknown in the first equation, the second unknown $x_2$ is the
leading unknown in the second equation, and so on. Thus, in particular, the system is square and each
leading unknown is directly to the right of the leading unknown in the preceding equation.

Such a triangular system always has a unique solution, which may be obtained by back-substitution. 
That is, 

(1) First solve the last equation for the last unknown to get $x_4 = 4$.

(2) Then substitute this value $x_4 = 4$ in the next-to-last equation, and solve for the next-to-last unknown
$x_3$ as follows:

\[ 7x_3 - 4 = 3 \quad\text{or}\quad 7x_3 = 7 \quad\text{or}\quad x_3 = 1 \]

(3) Now substitute $x_3 = 1$ and $x_4 = 4$ in the second equation, and solve for the second unknown $x_2$ as
follows:

\[ 5x_2 - 1 + 12 = 1 \quad\text{or}\quad 5x_2 + 11 = 1 \quad\text{or}\quad 5x_2 = -10 \quad\text{or}\quad x_2 = -2 \]

(4) Finally, substitute $x_2 = -2$, $x_3 = 1$, $x_4 = 4$ in the first equation, and solve for the first unknown $x_1$ as
follows:

\[ 2x_1 + 6 + 5 - 8 = 9 \quad\text{or}\quad 2x_1 + 3 = 9 \quad\text{or}\quad 2x_1 = 6 \quad\text{or}\quad x_1 = 3 \]

Thus, $x_1 = 3$, $x_2 = -2$, $x_3 = 1$, $x_4 = 4$, or, equivalently, the vector $u = (3, -2, 1, 4)$ is the unique solution
of the system.
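
Steps (1) to (4) follow a single pattern, which can be sketched as a loop that runs from the last equation up to the first. The sketch assumes a square triangular system with nonzero diagonal coefficients, stored as a coefficient matrix `U` and a list of constants `b`; the name `back_substitute` is hypothetical.

```python
from fractions import Fraction


def back_substitute(U, b):
    """Solve a triangular system: U is upper triangular with nonzero diagonal entries."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        # Contribution of the unknowns already found (those to the right of x_i).
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(b[i] - known) / U[i][i]
    return x


# The triangular system of this section.
U = [[2, -3, 5, -2],
     [0, 5, -1, 3],
     [0, 0, 7, -1],
     [0, 0, 0, 2]]
b = [9, 1, 3, 8]

print(back_substitute(U, b))
```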

Remark: There is an alternative form for back-substitution (which will be used when solving a
system using the matrix format). Namely, after first finding the value of the last unknown, we substitute
this value for the last unknown in all the preceding equations before solving for the next-to-last
unknown. This yields a triangular system with one less equation and one less unknown. For example, in
the above triangular system, we substitute $x_4 = 4$ in all the preceding equations to obtain the triangular
system

\[
\begin{aligned}
2x_1 - 3x_2 + 5x_3 &= 17 \\
5x_2 - x_3 &= -11 \\
7x_3 &= 7
\end{aligned}
\]

We then repeat the process using the new last equation. And so on.


Echelon Form, Pivot and Free Variables 

The following system of linear equations is said to be in echelon form:

\[
\begin{aligned}
2x_1 + 6x_2 - x_3 + 4x_4 - 2x_5 &= 15 \\
x_3 + 2x_4 + 2x_5 &= 5 \\
3x_4 - 9x_5 &= 6
\end{aligned}
\]

That is, no equation is degenerate and the leading unknown in each equation other than the first is to the
right of the leading unknown in the preceding equation. The leading unknowns in the system, $x_1$, $x_3$, $x_4$, are
called pivot variables, and the other unknowns, $x_2$ and $x_5$, are called free variables.

Generally speaking, an echelon system or a system in echelon form has the following form:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \cdots + a_{1n}x_n &= b_1 \\
a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n &= b_2 \\
\cdots\cdots\cdots\cdots \\
a_{rj_r}x_{j_r} + \cdots + a_{rn}x_n &= b_r
\end{aligned}
\tag{3.5}
\]

where $1 < j_2 < \cdots < j_r$ and $a_{11}, a_{2j_2}, \ldots, a_{rj_r}$ are not zero. The pivot variables are $x_1, x_{j_2}, \ldots, x_{j_r}$. Note
that $r \leq n$.

The solution set of any echelon system is described in the following theorem (proved in Problem 3.10).

THEOREM 3.6: Consider a system of linear equations in echelon form, say with $r$ equations in $n$
unknowns. There are two cases:

(i) $r = n$. That is, there are as many equations as unknowns (triangular form). Then
the system has a unique solution.

(ii) $r < n$. That is, there are more unknowns than equations. Then we can arbitrarily
assign values to the $n - r$ free variables and solve uniquely for the $r$ pivot
variables, obtaining a solution of the system.

Suppose an echelon system contains more unknowns than equations. Assuming the field $K$ is infinite,
the system has an infinite number of solutions, because each of the $n - r$ free variables may be assigned
any scalar.

The general solution of a system with free variables may be described in either of two equivalent ways, 
which we illustrate using the above echelon system where there are r = 3 equations and n = 5 unknowns. 
One description is called the “Parametric Form” of the solution, and the other description is called the 
“Free-Variable Form.” 

Parametric Form 

Assign arbitrary values, called parameters, to the free variables $x_2$ and $x_5$, say $x_2 = a$ and $x_5 = b$, and then
use back-substitution to obtain values for the pivot variables $x_1$, $x_3$, $x_4$ in terms of the parameters $a$ and $b$.
Specifically,

(1) Substitute $x_5 = b$ in the last equation, and solve for $x_4$:

\[ 3x_4 - 9b = 6 \quad\text{or}\quad 3x_4 = 6 + 9b \quad\text{or}\quad x_4 = 2 + 3b \]

(2) Substitute $x_4 = 2 + 3b$ and $x_5 = b$ into the second equation, and solve for $x_3$:

\[ x_3 + 2(2 + 3b) + 2b = 5 \quad\text{or}\quad x_3 + 4 + 8b = 5 \quad\text{or}\quad x_3 = 1 - 8b \]

(3) Substitute $x_2 = a$, $x_3 = 1 - 8b$, $x_4 = 2 + 3b$, $x_5 = b$ into the first equation, and solve for $x_1$:

\[ 2x_1 + 6a - (1 - 8b) + 4(2 + 3b) - 2b = 15 \quad\text{or}\quad x_1 = 4 - 3a - 9b \]

Accordingly, the general solution in parametric form is

\[ x_1 = 4 - 3a - 9b, \quad x_2 = a, \quad x_3 = 1 - 8b, \quad x_4 = 2 + 3b, \quad x_5 = b \]

or, equivalently, $v = (4 - 3a - 9b,\ a,\ 1 - 8b,\ 2 + 3b,\ b)$ where $a$ and $b$ are arbitrary numbers.

Free-Variable Form 

Use back-substitution to solve for the pivot variables $x_1$, $x_3$, $x_4$ directly in terms of the free variables $x_2$ and
$x_5$. That is, the last equation gives $x_4 = 2 + 3x_5$. Substitution in the second equation yields $x_3 = 1 - 8x_5$,
and then substitution in the first equation yields $x_1 = 4 - 3x_2 - 9x_5$. Accordingly,

\[ x_1 = 4 - 3x_2 - 9x_5, \quad x_2 = \text{free variable}, \quad x_3 = 1 - 8x_5, \quad x_4 = 2 + 3x_5, \quad x_5 = \text{free variable} \]

or, equivalently,

\[ v = (4 - 3x_2 - 9x_5,\ x_2,\ 1 - 8x_5,\ 2 + 3x_5,\ x_5) \]

is the free-variable form for the general solution of the system. 

We emphasize that there is no difference between the above two forms of the general solution, and the 
use of one or the other to represent the general solution is simply a matter of taste. 
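
A quick way to gain confidence in a general solution is to substitute it back into the system for several parameter values. The sketch below does this for the echelon system above; the function names are hypothetical and the parameter values tested are arbitrary.

```python
def general_solution(a, b):
    """Parametric form of the general solution, with x_2 = a and x_5 = b."""
    return (4 - 3 * a - 9 * b, a, 1 - 8 * b, 2 + 3 * b, b)


def satisfies_system(v):
    """Check the three equations of the echelon system of this section."""
    x1, x2, x3, x4, x5 = v
    return (2 * x1 + 6 * x2 - x3 + 4 * x4 - 2 * x5 == 15
            and x3 + 2 * x4 + 2 * x5 == 5
            and 3 * x4 - 9 * x5 == 6)


# Every choice of the two parameters should give a solution.
checks = [satisfies_system(general_solution(a, b))
          for a in range(-3, 4) for b in range(-3, 4)]
print(all(checks))
print(general_solution(1, 1))   # the particular solution for a = b = 1
```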

Remark: A particular solution of the above system can be found by assigning any values to the free
variables and then solving for the pivot variables by back-substitution. For example, setting $x_2 = 1$ and
$x_5 = 1$, we obtain

\[ x_4 = 2 + 3 = 5, \quad x_3 = 1 - 8 = -7, \quad x_1 = 4 - 3 - 9 = -8 \]

Thus, $u = (-8, 1, -7, 5, 1)$ is the particular solution corresponding to $x_2 = 1$ and $x_5 = 1$.


3.6 Gaussian Elimination

The main method for solving the general system (3.2) of linear equations is called Gaussian elimination. 
It essentially consists of two parts: 

Part A. (Forward Elimination) Step-by-step reduction of the system yielding either a degenerate 
equation with no solution (which indicates the system has no solution) or an equivalent simpler 
system in triangular or echelon form. 

Part B. (Backward Elimination) Step-by-step back-substitution to find the solution of the simpler 
system. 

Part B has already been investigated in Section 3.4. Accordingly, we need only give the algorithm for 
Part A, which is as follows. 

ALGORITHM 3.2 (for Part A): Input: The $m \times n$ system (3.2) of linear equations.

ELIMINATION STEP: Find the first unknown in the system with a nonzero coefficient (which now
must be $x_1$).

(a) Arrange so that $a_{11} \neq 0$. That is, if necessary, interchange equations so that the first unknown $x_1$
appears with a nonzero coefficient in the first equation.

(b) Use $a_{11}$ as a pivot to eliminate $x_1$ from all equations except the first equation. That is, for $i > 1$:

(1) Set $m = -a_{i1}/a_{11}$; (2) Replace $L_i$ by $mL_1 + L_i$

The system now has the following form:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\
a_{2j_2}x_{j_2} + \cdots + a_{2n}x_n &= b_2 \\
\cdots\cdots\cdots\cdots \\
a_{mj_2}x_{j_2} + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

where $x_1$ does not appear in any equation except the first, $a_{11} \neq 0$, and $x_{j_2}$ denotes the first
unknown with a nonzero coefficient in any equation other than the first.

(c) Examine each new equation $L$.

(1) If $L$ has the form $0x_1 + 0x_2 + \cdots + 0x_n = b$ with $b \neq 0$, then

STOP

The system is inconsistent and has no solution.

(2) If $L$ has the form $0x_1 + 0x_2 + \cdots + 0x_n = 0$ or if $L$ is a multiple of another equation, then delete $L$
from the system.

RECURSION STEP: Repeat the Elimination Step with each new “smaller” subsystem formed by all the 
equations excluding the first equation. 

OUTPUT: Finally, the system is reduced to triangular or echelon form, or a degenerate equation with no 
solution is obtained indicating an inconsistent system. 

The next remarks refer to the Elimination Step in Algorithm 3.2. 

(1) The following number $m$ in (b) is called the multiplier:

\[ m = -\frac{a_{i1}}{a_{11}} = -\frac{\text{coefficient to be deleted}}{\text{pivot}} \]

(2) One could alternatively apply the following operation in (b):

Replace $L_i$ by $-a_{i1}L_1 + a_{11}L_i$

This would avoid fractions if all the scalars were originally integers.


Gaussian Elimination Example

Here we illustrate in detail Gaussian elimination using the following system of linear equations:

\[
\begin{aligned}
L_1\colon\quad x - 3y - 2z &= 6 \\
L_2\colon\quad 2x - 4y - 3z &= 8 \\
L_3\colon\quad -3x + 6y + 8z &= -5
\end{aligned}
\]

Part A. We use the coefficient 1 of $x$ in the first equation $L_1$ as the pivot in order to eliminate $x$ from
the second equation $L_2$ and from the third equation $L_3$. This is accomplished as follows:

(1) Multiply $L_1$ by the multiplier $m = -2$ and add it to $L_2$; that is, “Replace $L_2$ by $-2L_1 + L_2$.”

(2) Multiply $L_1$ by the multiplier $m = 3$ and add it to $L_3$; that is, “Replace $L_3$ by $3L_1 + L_3$.”

These steps yield

\[
\begin{array}{rrcl}
(-2)L_1\colon & -2x + 6y + 4z & = & -12 \\
L_2\colon & 2x - 4y - 3z & = & 8 \\
\hline
\text{New } L_2\colon & 2y + z & = & -4
\end{array}
\qquad
\begin{array}{rrcl}
3L_1\colon & 3x - 9y - 6z & = & 18 \\
L_3\colon & -3x + 6y + 8z & = & -5 \\
\hline
\text{New } L_3\colon & -3y + 2z & = & 13
\end{array}
\]

Thus, the original system is replaced by the following system:

\[
\begin{aligned}
L_1\colon\quad x - 3y - 2z &= 6 \\
L_2\colon\quad 2y + z &= -4 \\
L_3\colon\quad -3y + 2z &= 13
\end{aligned}
\]

(Note that the equations L 2 and L 3 form a subsystem with one less equation and one less unknown than the 
original system.) 

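Next we use the coefficient 2 of $y$ in the (new) second equation $L_2$ as the pivot in order to eliminate $y$
from the (new) third equation $L_3$. This is accomplished as follows:

(3) Multiply $L_2$ by the multiplier $m = \frac{3}{2}$ and add it to $L_3$; that is, “Replace $L_3$ by $\frac{3}{2}L_2 + L_3$.”
(Alternately, “Replace $L_3$ by $3L_2 + 2L_3$,” which will avoid fractions.)

This step yields

\[
\begin{array}{rrcl}
\frac{3}{2}L_2\colon & 3y + \frac{3}{2}z & = & -6 \\
L_3\colon & -3y + 2z & = & 13 \\
\hline
\text{New } L_3\colon & \frac{7}{2}z & = & 7
\end{array}
\qquad\text{or}\qquad
\begin{array}{rrcl}
3L_2\colon & 6y + 3z & = & -12 \\
2L_3\colon & -6y + 4z & = & 26 \\
\hline
\text{New } L_3\colon & 7z & = & 14
\end{array}
\]

Thus, our system is replaced by the following system:

\[
\begin{aligned}
L_1\colon\quad x - 3y - 2z &= 6 \\
L_2\colon\quad 2y + z &= -4 \\
L_3\colon\quad 7z &= 14 \quad (\text{or } \tfrac{7}{2}z = 7)
\end{aligned}
\]

The system is now in triangular form, so Part A is completed.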

Part B. The values for the unknowns are obtained in reverse order, z,y,x, by back-substitution. 
Specifically, 

(1) Solve for $z$ in $L_3$ to get $z = 2$.

(2) Substitute $z = 2$ in $L_2$, and solve for $y$ to get $y = -3$.

(3) Substitute $y = -3$ and $z = 2$ in $L_1$, and solve for $x$ to get $x = 1$.

Thus, the solution of the triangular system and hence the original system is as follows:

\[ x = 1, \quad y = -3, \quad z = 2 \qquad\text{or, equivalently,}\qquad u = (1, -3, 2) \]


Condensed Format

The Gaussian elimination algorithm involves rewriting systems of linear equations. Sometimes we can
avoid excessive recopying of some of the equations by adopting a “condensed format.” This format for the
solution of the above system follows:

Number    Equation                 Operation
(1)       $x - 3y - 2z = 6$
(2)       $2x - 4y - 3z = 8$
(3)       $-3x + 6y + 8z = -5$
(2')      $2y + z = -4$            Replace $L_2$ by $-2L_1 + L_2$
(3')      $-3y + 2z = 13$          Replace $L_3$ by $3L_1 + L_3$
(3'')     $7z = 14$                Replace $L_3$ by $3L_2 + 2L_3$



That is, first we write down the number of each of the original equations. As we apply the Gaussian elimination
algorithm to the system, we only write down the new equations, and we label each new equation using the
same number as the original corresponding equation, but with an added prime. (After each new equation, we
will indicate, for instructional purposes, the elementary operation that yielded the new equation.)

The system in triangular form consists of equations (1), (2'), and (3''), the numbers with the largest
number of primes. Applying back-substitution to these equations again yields $x = 1$, $y = -3$, $z = 2$.
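
The three operations recorded in the condensed format can be replayed in code. This is a sketch only: each equation is stored as a list of the coefficients of $x$, $y$, $z$ followed by the constant, equations are indexed from 0, and the helper `combine` (a hypothetical name) implements operation [E], with [E_3] as the special case `k_target=1`.

```python
from fractions import Fraction


def combine(system, target, k, source, k_target=1):
    """Replace equation `target` by k*(equation `source`) + k_target*(equation `target`)."""
    if k_target == 0:
        raise ValueError("k_target must be nonzero")
    system[target] = [k * s + k_target * t
                      for s, t in zip(system[source], system[target])]


# Rows hold the coefficients of x, y, z and then the constant.
S = [[1, -3, -2, 6],
     [2, -4, -3, 8],
     [-3, 6, 8, -5]]

combine(S, 1, -2, 0)       # (2'):  Replace L2 by -2 L1 + L2
combine(S, 2, 3, 0)        # (3'):  Replace L3 by  3 L1 + L3
combine(S, 2, 3, 1, 2)     # (3''): Replace L3 by  3 L2 + 2 L3

# Part B: back-substitution on the triangular system (1), (2'), (3'').
z = Fraction(S[2][3], S[2][2])
y = (S[1][3] - S[1][2] * z) / S[1][1]
x = (S[0][3] - S[0][1] * y - S[0][2] * z) / S[0][0]
print(S)
print(x, y, z)
```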

Remark: If two equations need to be interchanged, say to obtain a nonzero coefficient as a pivot, 
then this is easily accomplished in the format by simply renumbering the two equations rather than 
changing their positions. 

EXAMPLE 3.7 Solve the following system:

\[
\begin{aligned}
x + 2y - 3z &= 1 \\
2x + 5y - 8z &= 4 \\
3x + 8y - 13z &= 7
\end{aligned}
\]

We solve the system by Gaussian elimination. 

Part A. (Forward Elimination) We use the coefficient 1 of $x$ in the first equation $L_1$ as the pivot in order to
eliminate $x$ from the second equation $L_2$ and from the third equation $L_3$. This is accomplished as follows:

(1) Multiply $L_1$ by the multiplier $m = -2$ and add it to $L_2$; that is, “Replace $L_2$ by $-2L_1 + L_2$.”

(2) Multiply $L_1$ by the multiplier $m = -3$ and add it to $L_3$; that is, “Replace $L_3$ by $-3L_1 + L_3$.”

The two steps yield

\[
\begin{aligned}
x + 2y - 3z &= 1 \\
y - 2z &= 2 \\
2y - 4z &= 4
\end{aligned}
\qquad\text{or}\qquad
\begin{aligned}
x + 2y - 3z &= 1 \\
y - 2z &= 2
\end{aligned}
\]

(The third equation is deleted, because it is a multiple of the second equation.) The system is now in echelon form
with free variable $z$.

Part B. (Backward Elimination) To obtain the general solution, let the free variable $z = a$, and solve for $x$ and $y$
by back-substitution. Substitute $z = a$ in the second equation to obtain $y = 2 + 2a$. Then substitute $z = a$ and
$y = 2 + 2a$ into the first equation to obtain

\[ x + 2(2 + 2a) - 3a = 1 \quad\text{or}\quad x + 4 + 4a - 3a = 1 \quad\text{or}\quad x = -3 - a \]

Thus, the following is the general solution where $a$ is a parameter:

\[ x = -3 - a, \quad y = 2 + 2a, \quad z = a \quad\text{or}\quad u = (-3 - a,\ 2 + 2a,\ a) \]

EXAMPLE 3.8 Solve the following system:

\[
\begin{aligned}
x_1 + 3x_2 - 2x_3 + 5x_4 &= 4 \\
2x_1 + 8x_2 - x_3 + 9x_4 &= 9 \\
3x_1 + 5x_2 - 12x_3 + 17x_4 &= 7
\end{aligned}
\]

We use Gaussian elimination. 

Part A. (Forward Elimination) We use the coefficient 1 of $x_1$ in the first equation $L_1$ as the pivot in order to
eliminate $x_1$ from the second equation $L_2$ and from the third equation $L_3$. This is accomplished by the following
operations:

(1) “Replace $L_2$ by $-2L_1 + L_2$” and (2) “Replace $L_3$ by $-3L_1 + L_3$”

These yield:

\[
\begin{aligned}
x_1 + 3x_2 - 2x_3 + 5x_4 &= 4 \\
2x_2 + 3x_3 - x_4 &= 1 \\
-4x_2 - 6x_3 + 2x_4 &= -5
\end{aligned}
\]

We now use the coefficient 2 of $x_2$ in the second equation $L_2$ as the pivot and the multiplier $m = 2$ in order to
eliminate $x_2$ from the third equation $L_3$. This is accomplished by the operation “Replace $L_3$ by $2L_2 + L_3$,” which then
yields the degenerate equation

\[ 0x_1 + 0x_2 + 0x_3 + 0x_4 = -3 \]

This equation and, hence, the original system have no solution: 

DO NOT CONTINUE 

Remark 1: As in the above examples, Part A of Gaussian elimination tells us whether or not the
system has a solution, that is, whether or not the system is consistent. Accordingly, Part B need never be
applied when a system has no solution.

Remark 2: If a system of linear equations has more than four unknowns and four equations, then it 
may be more convenient to use the matrix format for solving the system. This matrix format is discussed 
later. 


3.7 Echelon Matrices, Row Canonical Form, Row Equivalence 


One way to solve a system of linear equations is by working with its augmented matrix M rather than the 
system itself. This section introduces the necessary matrix concepts for such a discussion. These concepts, 
such as echelon matrices and elementary row operations, are also of independent interest. 

Echelon Matrices 

A matrix $A$ is called an echelon matrix, or is said to be in echelon form, if the following two conditions
hold (where a leading nonzero element of a row of $A$ is the first nonzero element in the row):

(1) All zero rows, if any, are at the bottom of the matrix.

(2) Each leading nonzero entry in a row is to the right of the leading nonzero entry in the preceding row.

That is, $A = [a_{ij}]$ is an echelon matrix if there exist nonzero entries

\[ a_{1j_1},\ a_{2j_2},\ \ldots,\ a_{rj_r}, \quad\text{where}\quad j_1 < j_2 < \cdots < j_r \]

with the property that

\[ a_{ij} = 0 \quad\text{for}\quad \begin{cases} \text{(i)}\ \ i \leq r,\ j < j_i \\ \text{(ii)}\ \ i > r \end{cases} \]

The entries $a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r}$, which are the leading nonzero elements in their respective rows, are called
the pivots of the echelon matrix.


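EXAMPLE 3.9 The following is an echelon matrix whose pivots have been circled:

\[
A = \begin{bmatrix}
0 & \textcircled{2} & 3 & 4 & 5 & 9 & 0 & 7 \\
0 & 0 & 0 & \textcircled{3} & 4 & 1 & 2 & 5 \\
0 & 0 & 0 & 0 & 0 & \textcircled{5} & 7 & 2 \\
0 & 0 & 0 & 0 & 0 & 0 & \textcircled{8} & 6 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

Observe that the pivots are in columns $C_2$, $C_4$, $C_6$, $C_7$, and each is to the right of the one above. Using the above
notation, the pivots are

\[ a_{1j_1} = 2, \quad a_{2j_2} = 3, \quad a_{3j_3} = 5, \quad a_{4j_4} = 8 \]

where $j_1 = 2$, $j_2 = 4$, $j_3 = 6$, $j_4 = 7$. Here $r = 4$.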

Row Canonical Form 

A matrix A is said to be in row canonical form (or row-reduced echelon form) if it is an echelon matrix— 
that is, if it satisfies the above properties (1) and (2), and if it satisfies the following additional two 
properties: 

(3) Each pivot (leading nonzero entry) is equal to 1. 

(4) Each pivot is the only nonzero entry in its column. 

The major difference between an echelon matrix and a matrix in row canonical form is that in an 
echelon matrix there must be zeros below the pivots [Properties (1) and (2)], but in a matrix in row 
canonical form, each pivot must also equal 1 [Property (3)] and there must also be zeros above the pivots 
[Property (4)]. 

The zero matrix 0 of any size and the identity matrix I of any size are important special examples of 
matrices in row canonical form. 
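
Properties (1) to (4) can be tested directly on a matrix stored as a list of rows. The following sketch is one possible way to do so; the function names are hypothetical, and the small matrices at the end are chosen only for illustration.

```python
def pivot_columns(A):
    """Column index of the leading nonzero entry of each row (None for a zero row)."""
    return [next((j for j, v in enumerate(row) if v != 0), None) for row in A]


def is_echelon(A):
    """Properties (1) and (2): zero rows at the bottom, pivots moving strictly right."""
    seen_zero_row = False
    previous = -1
    for col in pivot_columns(A):
        if col is None:
            seen_zero_row = True
        elif seen_zero_row or col <= previous:
            return False
        else:
            previous = col
    return True


def is_row_canonical(A):
    """Echelon form plus properties (3) and (4)."""
    if not is_echelon(A):
        return False
    for i, col in enumerate(pivot_columns(A)):
        if col is None:
            continue
        if A[i][col] != 1:                                            # property (3)
            return False
        if any(A[k][col] != 0 for k in range(len(A)) if k != i):      # property (4)
            return False
    return True


E = [[1, 2, 3], [0, 0, 1], [0, 0, 0]]                                 # echelon, not canonical
R = [[0, 1, 3, 0, 0, 4], [0, 0, 0, 1, 0, -3], [0, 0, 0, 0, 1, 2]]     # row canonical
N = [[0, 0, 0], [1, 0, 0]]                                            # zero row above a nonzero row
print(is_echelon(E), is_row_canonical(E))
print(is_echelon(R), is_row_canonical(R))
print(is_echelon(N))
```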

EXAMPLE 3.10

The following are echelon matrices whose pivots have been circled:

\[
\begin{bmatrix}
\textcircled{2} & 3 & 2 & 0 & 4 & 5 & -6 \\
0 & 0 & 0 & \textcircled{1} & -3 & 2 & 0 \\
0 & 0 & 0 & 0 & 0 & \textcircled{6} & 2 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix},
\qquad
\begin{bmatrix}
\textcircled{1} & 2 & 3 \\
0 & 0 & \textcircled{1} \\
0 & 0 & 0
\end{bmatrix},
\qquad
\begin{bmatrix}
0 & \textcircled{1} & 3 & 0 & 0 & 4 \\
0 & 0 & 0 & \textcircled{1} & 0 & -3 \\
0 & 0 & 0 & 0 & \textcircled{1} & 2
\end{bmatrix}
\]

The third matrix is also an example of a matrix in row canonical form. The second matrix is not in row canonical
form, because it does not satisfy property (4); that is, there is a nonzero entry above the second pivot in the third
column. The first matrix is not in row canonical form, because it satisfies neither property (3) nor property (4); that is,
some pivots are not equal to 1 and there are nonzero entries above the pivots.


Elementary Row Operations

Suppose $A$ is a matrix with rows $R_1, R_2, \ldots, R_m$. The following operations on $A$ are called elementary row
operations.

[E_1] (Row Interchange): Interchange rows $R_i$ and $R_j$. This may be written as

“Interchange $R_i$ and $R_j$” or “$R_i \leftrightarrow R_j$”

[E_2] (Row Scaling): Replace row $R_i$ by a nonzero multiple $kR_i$ of itself. This may be written as

“Replace $R_i$ by $kR_i$ ($k \neq 0$)” or “$kR_i \to R_i$”

[E_3] (Row Addition): Replace row $R_j$ by the sum of a multiple $kR_i$ of a row $R_i$ and itself. This may be
written as

“Replace $R_j$ by $kR_i + R_j$” or “$kR_i + R_j \to R_j$”

The arrow $\to$ in [E_2] and [E_3] may be read as “replaces.”

Sometimes (say to avoid fractions when all the given scalars are integers) we may apply [E_2] and [E_3] in
one step; that is, we may apply the following operation:

[E] Replace $R_j$ by the sum of a multiple $kR_i$ of a row $R_i$ and a nonzero multiple $k'R_j$ of itself. This may be
written as

“Replace $R_j$ by $kR_i + k'R_j$ ($k' \neq 0$)” or “$kR_i + k'R_j \to R_j$”

We emphasize that in operations [E_3] and [E] only row $R_j$ is changed.
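
The three elementary row operations can be sketched as small functions that return a new matrix and leave the input unchanged. The function names are hypothetical, rows are indexed from 0 rather than 1, and the 2 x 2 matrix is only an illustration.

```python
from fractions import Fraction


def interchange(A, i, j):
    """[E_1] Interchange rows i and j."""
    B = [row[:] for row in A]
    B[i], B[j] = B[j], B[i]
    return B


def scale(A, i, k):
    """[E_2] Replace row i by k*(row i), where k must be nonzero."""
    if k == 0:
        raise ValueError("k must be nonzero")
    B = [row[:] for row in A]
    B[i] = [k * v for v in B[i]]
    return B


def add_multiple(A, i, j, k):
    """[E_3] Replace row j by k*(row i) + (row j), where i and j are different rows."""
    if i == j:
        raise ValueError("i and j must be different rows")
    B = [row[:] for row in A]
    B[j] = [k * a + b for a, b in zip(A[i], A[j])]
    return B


A = [[1, 2], [3, 4]]
B = add_multiple(A, 0, 1, -3)              # Replace R2 by -3 R1 + R2
print(B)
print(add_multiple(B, 0, 1, 3) == A)       # adding 3 R1 back undoes the operation
print(scale(scale(A, 0, 5), 0, Fraction(1, 5)) == A)
print(interchange(interchange(A, 0, 1), 0, 1) == A)
```

Each operation is undone by an operation of the same type, which is why only row `j` (or row `i` for scaling) ever needs to be recomputed.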

Row Equivalence, Rank of a Matrix 

A matrix A is said to be row equivalent to a matrix B, written 
A ~ B 

if B can be obtained from A by a sequence of elementary row operations. In the case that B is also an 
echelon matrix, B is called an echelon form of A. 

The following are two basic results on row equivalence. 

THEOREM 3.7: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are row equivalent echelon matrices with respective
pivot entries

\[ a_{1j_1},\ a_{2j_2},\ \ldots,\ a_{rj_r} \quad\text{and}\quad b_{1k_1},\ b_{2k_2},\ \ldots,\ b_{sk_s} \]

Then $A$ and $B$ have the same number of nonzero rows, that is, $r = s$, and the pivot
entries are in the same positions, that is, $j_1 = k_1$, $j_2 = k_2$, ..., $j_r = k_r$.

THEOREM 3.8: Every matrix A is row equivalent to a unique matrix in row canonical form. 

The proofs of the above theorems will be postponed to Chapter 4. The unique matrix in Theorem 3.8 is 
called the row canonical form of A. 

Using the above theorems, we can now give our first definition of the rank of a matrix. 

DEFINITION: The rank of a matrix A, written rank (A), is equal to the number of pivots in an echelon 
form of A. 

The rank is a very important property of a matrix and, depending on the context in which the 
matrix is used, it will be defined in many different ways. Of course, all the definitions lead to the 
same number. 


The next section gives the matrix format of Gaussian elimination, which finds an echelon form of any
matrix $A$ (and hence the rank of $A$), and also finds the row canonical form of $A$.

One can show that row equivalence is an equivalence relation. That is,

(1) $A \sim A$ for any matrix $A$.

(2) If $A \sim B$, then $B \sim A$.

(3) If $A \sim B$ and $B \sim C$, then $A \sim C$.

Property (2) comes from the fact that each elementary row operation has an inverse operation of the same
type. Namely,

(i) “Interchange $R_i$ and $R_j$” is its own inverse.

(ii) “Replace $R_i$ by $kR_i$” and “Replace $R_i$ by $(1/k)R_i$” are inverses.

(iii) “Replace $R_j$ by $kR_i + R_j$” and “Replace $R_j$ by $-kR_i + R_j$” are inverses.

There is a similar result for operation [E] (Problem 3.73). 


3.8 Gaussian Elimination, Matrix Formulation 


This section gives two matrix algorithms that accomplish the following: 

(1) Algorithm 3.3 transforms any matrix A into an echelon form. 

(2) Algorithm 3.4 transforms the echelon matrix into its row canonical form. 

These algorithms, which use the elementary row operations, are simply restatements of Gaussian 
elimination as applied to matrices rather than to linear equations. (The term “row reduce” or simply 
“reduce” will mean to transform a matrix by the elementary row operations.) 

ALGORITHM 3.3 (Forward Elimination): The input is any matrix A. (The algorithm puts 0’s below
each pivot, working from the “top-down.”) The output is an echelon form of A.

Step 1. Find the first column with a nonzero entry. Let $j_1$ denote this column.

(a) Arrange so that $a_{1j_1} \neq 0$. That is, if necessary, interchange rows so that a nonzero entry
appears in the first row in column $j_1$.

(b) Use $a_{1j_1}$ as a pivot to obtain 0’s below $a_{1j_1}$.

Specifically, for $i > 1$:

(1) Set $m = -a_{ij_1}/a_{1j_1}$; (2) Replace $R_i$ by $mR_1 + R_i$

[That is, apply the operation $-(a_{ij_1}/a_{1j_1})R_1 + R_i \to R_i$.]

Step 2. Repeat Step 1 with the submatrix formed by all the rows excluding the first row. Here we let $j_2$
denote the first column in the submatrix with a nonzero entry. Hence, at the end of Step 2, we
have $a_{2j_2} \neq 0$.

Steps 3 to r. Continue the above process until a submatrix has only zero rows.

We emphasize that at the end of the algorithm, the pivots will be

\[ a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r} \]

where r denotes the number of nonzero rows in the final echelon matrix. 
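As an illustration, Algorithm 3.3 can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function name `forward_eliminate` and the use of exact `Fraction` arithmetic are assumptions made here, and the test matrix is the one from Example 3.11.

```python
from fractions import Fraction

def forward_eliminate(rows):
    """Return an echelon form of the matrix `rows` (sketch of Algorithm 3.3)."""
    a = [[Fraction(x) for x in row] for row in rows]
    m_rows, n_cols = len(a), len(a[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == m_rows:
            break
        # Step 1(a): look for a nonzero entry in this column, at or below pivot_row.
        src = next((i for i in range(pivot_row, m_rows) if a[i][col] != 0), None)
        if src is None:
            continue  # the column is zero in the current submatrix
        a[pivot_row], a[src] = a[src], a[pivot_row]
        # Step 1(b): set m = -a_ij / pivot and replace R_i by m*R_pivot + R_i.
        for i in range(pivot_row + 1, m_rows):
            m = -a[i][col] / a[pivot_row][col]
            a[i] = [m * p + x for p, x in zip(a[pivot_row], a[i])]
        pivot_row += 1  # Step 2: repeat on the submatrix below this row
    return a

A = [[1, 2, -3, 1, 2],
     [2, 4, -4, 6, 10],
     [3, 6, -6, 9, 13]]
echelon = forward_eliminate(A)
```

Exact fractions are used so that the multipliers $m$ are not affected by floating-point rounding.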

Remark 1: The following number m in Step 1(b) is called the multiplier:

\[ m = -\frac{a_{ij_1}}{a_{1j_1}} = -\frac{\text{entry to be deleted}}{\text{pivot}} \]

Remark 2: One could replace the operation in Step 1(b) by the following, which would avoid
fractions if all the scalars were originally integers:

Replace $R_i$ by $-a_{ij_1}R_1 + a_{1j_1}R_i$.


ALGORITHM 3.4 (Backward Elimination): The input is a matrix $A = [a_{ij}]$ in echelon form with pivot
entries

\[ a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r} \]

The output is the row canonical form of A.

Step 1. (a) (Use row scaling so the last pivot equals 1.) Multiply the last nonzero row $R_r$ by $1/a_{rj_r}$.
(b) (Use $a_{rj_r} = 1$ to obtain 0’s above the pivot.) For $i = r-1, r-2, \ldots, 2, 1$:

(1) Set $m = -a_{ij_r}$; (2) Replace $R_i$ by $mR_r + R_i$

(That is, apply the operations $-a_{ij_r}R_r + R_i \to R_i$.)

Steps 2 to r−1. Repeat Step 1 for rows $R_{r-1}, R_{r-2}, \ldots, R_2$.

Step r. (Use row scaling so the first pivot equals 1.) Multiply $R_1$ by $1/a_{1j_1}$.
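The steps of Algorithm 3.4 can be sketched in the same way. The function name `backward_eliminate` is hypothetical, and the input is assumed to already be in echelon form (here, the echelon form reached in Example 3.11).

```python
from fractions import Fraction

def backward_eliminate(echelon_rows):
    """Reduce an echelon matrix to row canonical form (sketch of Algorithm 3.4)."""
    a = [[Fraction(x) for x in row] for row in echelon_rows]
    # Locate the pivot (first nonzero entry) of each nonzero row.
    pivots = []
    for r, row in enumerate(a):
        col = next((j for j, x in enumerate(row) if x != 0), None)
        if col is not None:
            pivots.append((r, col))
    # Work from the last pivot up to the first.
    for r, col in reversed(pivots):
        p = a[r][col]
        a[r] = [x / p for x in a[r]]          # row scaling: the pivot becomes 1
        for i in range(r - 1, -1, -1):        # put 0's above the pivot
            m = -a[i][col]
            a[i] = [m * y + x for y, x in zip(a[r], a[i])]
    return a

E = [[1, 2, -3, 1, 2],
     [0, 0, 2, 4, 6],
     [0, 0, 0, 0, -2]]
canonical = backward_eliminate(E)
```

The loop over `reversed(pivots)` mirrors Steps 1 to r: each pass scales one pivot row and clears the column above it.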

There is an alternative form of Algorithm 3.4, which we describe here in words. The formal 
description of this algorithm is left to the reader as a supplementary problem. 

ALTERNATIVE ALGORITHM 3.4 Puts 0’s above the pivots row by row from the bottom up (rather
than column by column from right to left).

The alternative algorithm, when applied to an augmented matrix M of a system of linear equations, is 
essentially the same as solving for the pivot unknowns one after the other from the bottom up. 


Remark: We emphasize that Gaussian elimination is a two-stage process. Specifically, 

Stage A (Algorithm 3.3). Puts 0’s below each pivot, working from the top row $R_1$ down.

Stage B (Algorithm 3.4). Puts 0’s above each pivot, working from the bottom row $R_r$ up.

There is another algorithm, called Gauss-Jordan, that also row reduces a matrix to its row canonical form.
The difference is that Gauss-Jordan puts 0’s both below and above each pivot as it works its way from the
top row $R_1$ down. Although Gauss-Jordan may be easier to state and understand, it is much less efficient
than the two-stage Gaussian elimination algorithm.


EXAMPLE 3.11 Consider the matrix

\[ A = \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 2 & 4 & -4 & 6 & 10 \\ 3 & 6 & -6 & 9 & 13 \end{bmatrix} \]


(a) Use Algorithm 3.3 to reduce A to an echelon form. 


(b) Use Algorithm 3.4 to further reduce A to its row canonical form. 


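(a) First use $a_{11} = 1$ as a pivot to obtain 0’s below $a_{11}$; that is, apply the operations “Replace $R_2$ by $-2R_1 + R_2$” and
“Replace $R_3$ by $-3R_1 + R_3$.” Then use $a_{23} = 2$ as a pivot to obtain 0 below $a_{23}$; that is, apply the operation
“Replace $R_3$ by $-\tfrac{3}{2}R_2 + R_3$.” This yields

\[ A \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 0 & 0 & 2 & 4 & 6 \\ 0 & 0 & 3 & 6 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 0 & 0 & 2 & 4 & 6 \\ 0 & 0 & 0 & 0 & -2 \end{bmatrix} \]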


The matrix is now in echelon form.

(b) Multiply $R_3$ by $-\tfrac{1}{2}$ so the pivot entry $a_{35} = 1$, and then use $a_{35} = 1$ as a pivot to obtain 0’s above it by the
operations “Replace $R_2$ by $-6R_3 + R_2$” and then “Replace $R_1$ by $-2R_3 + R_1$.” This yields

\[ A \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 0 & 0 & 2 & 4 & 6 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 0 \\ 0 & 0 & 2 & 4 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \]


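Multiply $R_2$ by $\tfrac{1}{2}$ so the pivot entry $a_{23} = 1$, and then use $a_{23} = 1$ as a pivot to obtain 0’s above it by the operation
“Replace $R_1$ by $3R_2 + R_1$.” This yields

\[ A \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & 7 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \]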


The last matrix is the row canonical form of A. 


Application to Systems of Linear Equations 

One way to solve a system of linear equations is by working with its augmented matrix M rather than the 
equations themselves. Specifically, we reduce M to echelon form (which tells us whether the system has a 
solution), and then further reduce M to its row canonical form (which essentially gives the solution of the 
original system of linear equations). The justification for this process comes from the following facts: 

(1) Any elementary row operation on the augmented matrix M of the system is equivalent to applying the 
corresponding operation on the system itself. 

(2) The system has a solution if and only if the echelon form of the augmented matrix M does not have a
row of the form $(0, 0, \ldots, 0, b)$ with $b \neq 0$.

(3) In the row canonical form of the augmented matrix M (excluding zero rows), the coefficient of each 
basic variable is a pivot entry equal to 1, and it is the only nonzero entry in its respective column; 
hence, the free-variable form of the solution of the system of linear equations is obtained by simply 
transferring the free variables to the other side. 

This process is illustrated below. 

EXAMPLE 3.12 Solve each of the following systems: 

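(a)
\[ \begin{aligned} x_1 + x_2 - 2x_3 + 4x_4 &= 5 \\ 2x_1 + 2x_2 - 3x_3 + x_4 &= 3 \\ 3x_1 + 3x_2 - 4x_3 - 2x_4 &= 1 \end{aligned} \]

(b)
\[ \begin{aligned} x_1 + x_2 - 2x_3 + 3x_4 &= 4 \\ 2x_1 + 3x_2 + 3x_3 - x_4 &= 3 \\ 5x_1 + 7x_2 + 4x_3 + x_4 &= 5 \end{aligned} \]

(c)
\[ \begin{aligned} x + 2y + z &= 3 \\ 2x + 5y - z &= -4 \\ 3x - 2y - z &= 5 \end{aligned} \]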

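(a) Reduce its augmented matrix M to echelon form and then to row canonical form as follows:

\[ M = \begin{bmatrix} 1 & 1 & -2 & 4 & 5 \\ 2 & 2 & -3 & 1 & 3 \\ 3 & 3 & -4 & -2 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -2 & 4 & 5 \\ 0 & 0 & 1 & -7 & -7 \\ 0 & 0 & 2 & -14 & -14 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 0 & -10 & -9 \\ 0 & 0 & 1 & -7 & -7 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]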


Rewrite the row canonical form in terms of a system of linear equations to obtain the free variable form of the
solution. That is,

\[ \begin{aligned} x_1 + x_2 - 10x_4 &= -9 \\ x_3 - 7x_4 &= -7 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x_1 &= -9 - x_2 + 10x_4 \\ x_3 &= -7 + 7x_4 \end{aligned} \]

(The zero row is omitted in the solution.) Observe that $x_1$ and $x_3$ are the pivot variables, and $x_2$ and $x_4$ are the free
variables.

(b) First reduce its augmented matrix M to echelon form as follows:

\[ M = \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 2 & 3 & 3 & -1 & 3 \\ 5 & 7 & 4 & 1 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 0 & 1 & 7 & -7 & -5 \\ 0 & 2 & 14 & -14 & -15 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 0 & 1 & 7 & -7 & -5 \\ 0 & 0 & 0 & 0 & -5 \end{bmatrix} \]

There is no need to continue to find the row canonical form of M, because the echelon form already tells us that
the system has no solution. Specifically, the third row of the echelon matrix corresponds to the degenerate
equation

\[ 0x_1 + 0x_2 + 0x_3 + 0x_4 = -5 \]

which has no solution. Thus, the system has no solution.

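(c) Reduce its augmented matrix M to echelon form and then to row canonical form as follows:

\[ M = \begin{bmatrix} 1 & 2 & 1 & 3 \\ 2 & 5 & -1 & -4 \\ 3 & -2 & -1 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & -8 & -4 & -4 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & 0 & -28 & -84 \end{bmatrix} \]

\[ \sim \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & 0 & 1 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix} \]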




Thus, the system has the unique solution x = 2, y = — 1, z = 3, or, equivalently, the vector u = (2,-1,3). We 
note that the echelon form of M already indicated that the solution was unique, because it corresponded to a 
triangular system. 
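The three outcomes in Example 3.12 can be checked mechanically. The sketch below is an assumption-laden illustration: `rref` and `classify` are hypothetical helper names, and for brevity `rref` clears entries above and below each pivot in one pass (Gauss-Jordan style) rather than following the two-stage process of Algorithms 3.3 and 3.4.

```python
from fractions import Fraction

def rref(rows):
    """Return (row canonical form, list of pivot columns) using exact arithmetic."""
    a = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(a[0])):
        src = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if src is None:
            continue
        a[r], a[src] = a[src], a[r]
        a[r] = [x / a[r][c] for x in a[r]]           # scale so the pivot is 1
        for i in range(len(a)):
            if i != r:
                m = -a[i][c]
                a[i] = [m * p + x for p, x in zip(a[r], a[i])]
        pivots.append(c)
        r += 1
        if r == len(a):
            break
    return a, pivots

def classify(aug):
    """Classify the system with augmented matrix [A, B]."""
    n = len(aug[0]) - 1                # number of unknowns
    _, pivots = rref(aug)
    if n in pivots:                    # a row (0, ..., 0, b) with b != 0
        return "none"
    return "unique" if len(pivots) == n else "infinite"

Ma = [[1, 1, -2, 4, 5], [2, 2, -3, 1, 3], [3, 3, -4, -2, 1]]
Mb = [[1, 1, -2, 3, 4], [2, 3, 3, -1, 3], [5, 7, 4, 1, 5]]
Mc = [[1, 2, 1, 3], [2, 5, -1, -4], [3, -2, -1, 5]]
results = [classify(M) for M in (Ma, Mb, Mc)]
solution_c = [row[-1] for row in rref(Mc)[0]]
```

When the system is classified as unique, the last column of the row canonical form is the solution vector.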


Application to Existence and Uniqueness Theorems 

This subsection gives theoretical conditions for the existence and uniqueness of a solution of a system of 
linear equations using the notion of the rank of a matrix. 

THEOREM 3.9: Consider a system of linear equations in n unknowns with augmented matrix
$M = [A, B]$. Then,

(a) The system has a solution if and only if rank(A) = rank(M).

(b) The solution is unique if and only if rank(A) = rank(M) = n.

Proof of (a). The system has a solution if and only if an echelon form of $M = [A, B]$ does not have a row
of the form

\[ (0, 0, \ldots, 0, b), \quad\text{with } b \neq 0 \]

If an echelon form of M does have such a row, then b is a pivot of M but not of A, and hence,
rank(M) > rank(A). Otherwise, the echelon forms of A and M have the same pivots, and hence,
rank(A) = rank(M). This proves (a).

Proof of (b). The system has a unique solution if and only if an echelon form has no free variable. This
means there is a pivot for each unknown. Accordingly, n = rank(A) = rank(M). This proves (b).
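Theorem 3.9 can be illustrated by counting pivots. In this minimal sketch, `rank` and `existence_uniqueness` are hypothetical names, and the data are the coefficient matrices and constants of the three systems in Example 3.12.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots in an echelon form of `rows` (forward elimination only)."""
    a = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(a[0])):
        src = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if src is None:
            continue
        a[r], a[src] = a[src], a[r]
        for i in range(r + 1, len(a)):
            m = -a[i][c] / a[r][c]
            a[i] = [m * p + x for p, x in zip(a[r], a[i])]
        r += 1
        if r == len(a):
            break
    return r

def existence_uniqueness(A, B):
    """Return (has_solution, is_unique) for AX = B, following Theorem 3.9."""
    M = [row + [b] for row, b in zip(A, B)]   # augmented matrix [A, B]
    n = len(A[0])
    rA, rM = rank(A), rank(M)
    return rA == rM, rA == rM == n

case_a = existence_uniqueness([[1, 1, -2, 4], [2, 2, -3, 1], [3, 3, -4, -2]], [5, 3, 1])
case_b = existence_uniqueness([[1, 1, -2, 3], [2, 3, 3, -1], [5, 7, 4, 1]], [4, 3, 5])
case_c = existence_uniqueness([[1, 2, 1], [2, 5, -1], [3, -2, -1]], [3, -4, 5])
```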

The above proof uses the fact (Problem 3.74) that an echelon form of the augmented matrix $M = [A, B]$
also automatically yields an echelon form of A.

3.9 Matrix Equation of a System of Linear Equations


The general system (3.2) of m linear equations in n unknowns is equivalent to the matrix equation

\[ \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \qquad\text{or}\qquad AX = B \]

where $A = [a_{ij}]$ is the coefficient matrix, $X = [x_j]$ is the column vector of unknowns, and $B = [b_i]$ is the
column vector of constants. (Some texts write Ax = b rather than AX = B, in order to emphasize that x and
b are simply column vectors.)

The statement that the system of linear equations and the matrix equation are equivalent means that any 
vector solution of the system is a solution of the matrix equation, and vice versa. 


EXAMPLE 3.13 The following system of linear equations and matrix equation are equivalent:

\[ \begin{aligned} x_1 + 2x_2 - 4x_3 + 7x_4 &= 4 \\ 3x_1 - 5x_2 + 6x_3 - 8x_4 &= 8 \\ 4x_1 - 3x_2 - 2x_3 + 6x_4 &= 11 \end{aligned} \qquad\text{and}\qquad \begin{bmatrix} 1 & 2 & -4 & 7 \\ 3 & -5 & 6 & -8 \\ 4 & -3 & -2 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 4 \\ 8 \\ 11 \end{bmatrix} \]

We note that $x_1 = 3$, $x_2 = 1$, $x_3 = 2$, $x_4 = 1$, or, in other words, the vector $u = [3, 1, 2, 1]$ is a solution of the
system. Thus, the (column) vector u is also a solution of the matrix equation.


The matrix form AX = B of a system of linear equations is notationally very convenient when 
discussing and proving properties of systems of linear equations. This is illustrated with our first theorem 
(described in Fig. 3-1), which we restate for easy reference. 

THEOREM 3.1: Suppose the field K is infinite. Then the system AX = B has: (a) a unique solution, (b)
no solution, or (c) an infinite number of solutions.


Proof. It suffices to show that if AX = B has more than one solution, then it has infinitely many.
Suppose u and v are distinct solutions of AX = B; that is, Au = B and Av = B. Then, for any $k \in K$,

\[ A[u + k(u - v)] = Au + k(Au - Av) = B + k(B - B) = B \]

Thus, for each $k \in K$, the vector $u + k(u - v)$ is a solution of AX = B. Because all such solutions are
distinct (Problem 3.47), AX = B has an infinite number of solutions.

Observe that the above theorem is true when K is the real field R (or the complex field C). Section 3.3 
shows that the theorem has a geometrical description when the system consists of two equations in two 
unknowns, where each equation represents a line in R 2 . The theorem also has a geometrical description 
when the system consists of three nondegenerate equations in three unknowns, where the three equations 
correspond to planes H x , H 2 , H 3 in R 3 . That is, 

(a) Unique solution: Here the three planes intersect in exactly one point. 

(b) No solution: Here the planes may intersect pairwise but with no common point of intersection, or two 
of the planes may be parallel. 

(c) Infinite number of solutions: Here the three planes may intersect in a line (one free variable), or they 
may coincide (two free variables). 

These three cases are pictured in Fig. 3-3. 


Matrix Equation of a Square System of Linear Equations 

A system AX = B of linear equations is square if and only if the matrix A of coefficients is square. In such a
case, we have the following important result.

[Figure 3-3, panel (b): No solutions]

THEOREM 3.10: A square system AX = B of linear equations has a unique solution if and only if the
matrix A is invertible. In such a case, $A^{-1}B$ is the unique solution of the system.

We only prove here that if A is invertible, then $A^{-1}B$ is a unique solution. If A is invertible, then

\[ A(A^{-1}B) = (AA^{-1})B = IB = B \]

and hence, $A^{-1}B$ is a solution. Now suppose v is any solution, so Av = B. Then

\[ v = Iv = (A^{-1}A)v = A^{-1}(Av) = A^{-1}B \]

Thus, the solution $A^{-1}B$ is unique.

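EXAMPLE 3.14 Consider the following system of linear equations, whose coefficient matrix A and
inverse $A^{-1}$ are also given:

\[ \begin{aligned} x + 2y + 3z &= 1 \\ x + 3y + 6z &= 3 \\ 2x + 6y + 13z &= 5 \end{aligned} \qquad A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 6 \\ 2 & 6 & 13 \end{bmatrix}, \qquad A^{-1} = \begin{bmatrix} 3 & -8 & 3 \\ -1 & 7 & -3 \\ 0 & -2 & 1 \end{bmatrix} \]

By Theorem 3.10, the unique solution of the system is

\[ A^{-1}B = \begin{bmatrix} 3 & -8 & 3 \\ -1 & 7 & -3 \\ 0 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = \begin{bmatrix} -6 \\ 5 \\ -1 \end{bmatrix} \]

That is, $x = -6$, $y = 5$, $z = -1$.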
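The arithmetic of Example 3.14 can be confirmed with a few lines of Python. The helper `mat_vec` is a hypothetical name introduced only for this check; the matrices are those given in the example.

```python
def mat_vec(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[1, 2, 3], [1, 3, 6], [2, 6, 13]]
A_inv = [[3, -8, 3], [-1, 7, -3], [0, -2, 1]]
B = [1, 3, 5]

solution = mat_vec(A_inv, B)   # A^{-1} B
check = mat_vec(A, solution)   # substituting back should reproduce B
```

Substituting the result back into $AX$ reproduces $B$, which is the content of the identity $A(A^{-1}B) = B$ used in the proof of Theorem 3.10.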

Remark: We emphasize that Theorem 3.10 does not usually help us to find the solution of a square
system. That is, finding the inverse of a coefficient matrix A is not usually any easier than solving the
system directly. Thus, unless we are given the inverse of a coefficient matrix A, as in Example 3.14,
we usually solve a square system by Gaussian elimination (or some iterative method whose discussion lies
beyond the scope of this text).

3.10 Systems of Linear Equations and Linear Combinations of Vectors

The general system (3.2) of linear equations may be rewritten as the following vector equation:

\[ x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \]

Recall that a vector v in $K^n$ is said to be a linear combination of vectors $u_1, u_2, \ldots, u_m$ in $K^n$ if there exist
scalars $a_1, a_2, \ldots, a_m$ in K such that

\[ v = a_1u_1 + a_2u_2 + \cdots + a_mu_m \]

Accordingly, the general system (3.2) of linear equations and the above equivalent vector equation have a
solution if and only if the column vector of constants is a linear combination of the columns of the
coefficient matrix. We state this observation formally.

THEOREM 3.11: A system AX = B of linear equations has a solution if and only if B is a linear
combination of the columns of the coefficient matrix A.

Thus, the answer to the problem of expressing a given vector v in $K^n$ as a linear combination of vectors
$u_1, u_2, \ldots, u_m$ in $K^n$ reduces to solving a system of linear equations.

Linear Combination Example 

Suppose we want to write the vector $v = (1, -2, 5)$ as a linear combination of the vectors

\[ u_1 = (1, 1, 1), \quad u_2 = (1, 2, 3), \quad u_3 = (2, -1, 1) \]

First we write $v = xu_1 + yu_2 + zu_3$ with unknowns x, y, z, and then we find the equivalent system of linear
equations which we solve. Specifically, we first write

\[ \begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + z \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} \qquad (*) \]

Then

\[ \begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = \begin{bmatrix} x \\ x \\ x \end{bmatrix} + \begin{bmatrix} y \\ 2y \\ 3y \end{bmatrix} + \begin{bmatrix} 2z \\ -z \\ z \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ x + 2y - z \\ x + 3y + z \end{bmatrix} \]

Setting corresponding entries equal to each other yields the following equivalent system:

\[ \begin{aligned} x + y + 2z &= 1 \\ x + 2y - z &= -2 \\ x + 3y + z &= 5 \end{aligned} \qquad (**) \]

For notational convenience, we have written the vectors in $R^n$ as columns, because it is then easier to find
the equivalent system of linear equations. In fact, one can easily go from the vector equation (*) directly to
the system (**).

Now we solve the equivalent system of linear equations by reducing the system to echelon form. This
yields

\[ \begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 2y - z &= 4 \end{aligned} \qquad\text{and then}\qquad \begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 5z &= 10 \end{aligned} \]

Back-substitution yields the solution $x = -6$, $y = 3$, $z = 2$. Thus, $v = -6u_1 + 3u_2 + 2u_3$.

EXAMPLE 3.15

(a) Write the vector $v = (4, 9, 19)$ as a linear combination of

\[ u_1 = (1, -2, 3), \quad u_2 = (3, -7, 10), \quad u_3 = (2, 1, 9) \]

Find the equivalent system of linear equations by writing $v = xu_1 + yu_2 + zu_3$, and reduce the system to an
echelon form. We have

\[ \begin{aligned} x + 3y + 2z &= 4 \\ -2x - 7y + z &= 9 \\ 3x + 10y + 9z &= 19 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 3y + 2z &= 4 \\ -y + 5z &= 17 \\ y + 3z &= 7 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 3y + 2z &= 4 \\ -y + 5z &= 17 \\ 8z &= 24 \end{aligned} \]

Back-substitution yields the solution $x = 4$, $y = -2$, $z = 3$. Thus, v is a linear combination of $u_1, u_2, u_3$.
Specifically, $v = 4u_1 - 2u_2 + 3u_3$.

(b) Write the vector $v = (2, 3, -5)$ as a linear combination of

\[ u_1 = (1, 2, -3), \quad u_2 = (2, 3, -4), \quad u_3 = (1, 3, -5) \]

Find the equivalent system of linear equations by writing $v = xu_1 + yu_2 + zu_3$, and reduce the system to an
echelon form. We have

\[ \begin{aligned} x + 2y + z &= 2 \\ 2x + 3y + 3z &= 3 \\ -3x - 4y - 5z &= -5 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 2 \\ -y + z &= -1 \\ 2y - 2z &= 1 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 2 \\ -y + z &= -1 \\ 0 &= -1 \end{aligned} \]

The system has no solution. Thus, it is impossible to write v as a linear combination of $u_1, u_2, u_3$.
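The reduction of a linear-combination problem to a linear system can be sketched as follows. The name `combination_coefficients` is hypothetical; the function returns one set of coefficients (free variables, if any, are set to 0 as an arbitrary choice) or `None` when the system is inconsistent.

```python
from fractions import Fraction

def combination_coefficients(v, vectors):
    """Solve x_1*u_1 + ... + x_k*u_k = v; return coefficients or None."""
    k = len(vectors)
    # Row i of the augmented matrix holds the i-th entries of u_1..u_k, then v_i.
    a = [[Fraction(u[i]) for u in vectors] + [Fraction(v[i])] for i in range(len(v))]
    pivots, r = [], 0
    for c in range(k + 1):
        src = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if src is None:
            continue
        a[r], a[src] = a[src], a[r]
        a[r] = [x / a[r][c] for x in a[r]]
        for i in range(len(a)):
            if i != r:
                m = -a[i][c]
                a[i] = [m * p + x for p, x in zip(a[r], a[i])]
        pivots.append(c)
        r += 1
        if r == len(a):
            break
    if k in pivots:                 # pivot in the constants column: no solution
        return None
    coeffs = [Fraction(0)] * k
    for row, c in enumerate(pivots):
        coeffs[c] = a[row][k]
    return coeffs

first = combination_coefficients((1, -2, 5), [(1, 1, 1), (1, 2, 3), (2, -1, 1)])
part_a = combination_coefficients((4, 9, 19), [(1, -2, 3), (3, -7, 10), (2, 1, 9)])
part_b = combination_coefficients((2, 3, -5), [(1, 2, -3), (2, 3, -4), (1, 3, -5)])
```

The vectors $u_1, \ldots, u_k$ become the columns of the coefficient matrix, in line with Theorem 3.11.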

Linear Combinations of Orthogonal Vectors, Fourier Coefficients 

Recall first (Section 1.4) that the dot (inner) product $u \cdot v$ of vectors $u = (a_1, \ldots, a_n)$ and $v = (b_1, \ldots, b_n)$
in $R^n$ is defined by

\[ u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n \]

Furthermore, vectors u and v are said to be orthogonal if their dot product $u \cdot v = 0$.

Suppose that $u_1, u_2, \ldots, u_n$ in $R^n$ are n nonzero pairwise orthogonal vectors. This means

(i) $u_i \cdot u_j = 0$ for $i \neq j$ and (ii) $u_i \cdot u_i \neq 0$ for each i

Then, for any vector v in $R^n$, there is an easy way to write v as a linear combination of $u_1, u_2, \ldots, u_n$,
which is illustrated in the next example.

EXAMPLE 3.16 Consider the following three vectors in $R^3$:

\[ u_1 = (1, 1, 1), \quad u_2 = (1, -3, 2), \quad u_3 = (5, -1, -4) \]

These vectors are pairwise orthogonal; that is,

\[ u_1 \cdot u_2 = 1 - 3 + 2 = 0, \quad u_1 \cdot u_3 = 5 - 1 - 4 = 0, \quad u_2 \cdot u_3 = 5 + 3 - 8 = 0 \]

Suppose we want to write $v = (4, 14, -9)$ as a linear combination of $u_1, u_2, u_3$.

Method 1. Find the equivalent system of linear equations as in Example 3.15 and then solve,
obtaining $v = 3u_1 - 4u_2 + u_3$.

Method 2. (This method uses the fact that the vectors $u_1, u_2, u_3$ are mutually orthogonal, and hence,
the arithmetic is much simpler.) Set v as a linear combination of $u_1, u_2, u_3$ using unknown scalars x, y, z as
follows:

\[ (4, 14, -9) = x(1, 1, 1) + y(1, -3, 2) + z(5, -1, -4) \qquad (*) \]

Take the dot product of (*) with respect to $u_1$ to get

\[ (4, 14, -9) \cdot (1, 1, 1) = x(1, 1, 1) \cdot (1, 1, 1) \quad\text{or}\quad 9 = 3x \quad\text{or}\quad x = 3 \]

(The last two terms drop out, because $u_1$ is orthogonal to $u_2$ and to $u_3$.) Next take the dot product of (*) with respect to
$u_2$ to obtain

\[ (4, 14, -9) \cdot (1, -3, 2) = y(1, -3, 2) \cdot (1, -3, 2) \quad\text{or}\quad -56 = 14y \quad\text{or}\quad y = -4 \]

Finally, take the dot product of (*) with respect to $u_3$ to get

\[ (4, 14, -9) \cdot (5, -1, -4) = z(5, -1, -4) \cdot (5, -1, -4) \quad\text{or}\quad 42 = 42z \quad\text{or}\quad z = 1 \]

Thus, $v = 3u_1 - 4u_2 + u_3$.

The procedure in Method 2 in Example 3.16 is valid in general. Namely, 


THEOREM 3.12: Suppose $u_1, u_2, \ldots, u_n$ are nonzero mutually orthogonal vectors in $R^n$. Then, for any
vector v in $R^n$,

\[ v = \frac{v \cdot u_1}{u_1 \cdot u_1}u_1 + \frac{v \cdot u_2}{u_2 \cdot u_2}u_2 + \cdots + \frac{v \cdot u_n}{u_n \cdot u_n}u_n \]

We emphasize that there must be n such orthogonal vectors $u_i$ in $R^n$ for the formula to be used. Note
also that each $u_i \cdot u_i \neq 0$, because each $u_i$ is a nonzero vector.


Remark: The following scalar $k_i$ (appearing in Theorem 3.12) is called the Fourier coefficient of v
with respect to $u_i$:

\[ k_i = \frac{v \cdot u_i}{u_i \cdot u_i} = \frac{v \cdot u_i}{\|u_i\|^2} \]
It is analogous to a coefficient in the celebrated Fourier series of a function. 
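The formula of Theorem 3.12 translates directly into code. In this minimal sketch the names `dot` and `fourier_coefficients` are hypothetical, and the caller is assumed to supply nonzero, pairwise orthogonal vectors (the function does not check this); the data are from Example 3.16.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(v, basis):
    """k_i = (v . u_i) / (u_i . u_i) for nonzero, pairwise orthogonal u_i."""
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

us = [(1, 1, 1), (1, -3, 2), (5, -1, -4)]
v = (4, 14, -9)
ks = fourier_coefficients(v, us)

# Rebuild v from the coefficients: sum of k_i * u_i, entry by entry.
rebuilt = tuple(sum(k * u[i] for k, u in zip(ks, us)) for i in range(len(v)))
```

No system of equations is solved here; each coefficient needs only two dot products.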


3.11 Homogeneous Systems of Linear Equations 


A system of linear equations is said to be homogeneous if all the constant terms are zero. Thus, a
homogeneous system has the form AX = 0. Clearly, such a system always has the zero vector
0 = (0, 0, ..., 0) as a solution, called the zero or trivial solution. Accordingly, we are usually interested
in whether or not the system has a nonzero solution.

Because a homogeneous system AX = 0 has at least the zero solution, it can always be put in an echelon
form, say

\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \cdots + a_{1n}x_n &= 0 \\ a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n &= 0 \\ \cdots\cdots\cdots\cdots\cdots\cdots & \\ a_{rj_r}x_{j_r} + \cdots + a_{rn}x_n &= 0 \end{aligned} \]

Here r denotes the number of equations in echelon form and n denotes the number of unknowns. Thus, the 
echelon system has n — r free variables. 

The question of nonzero solutions reduces to the following two cases: 

(i) r = n. The system has only the zero solution. 

(ii) r < n. The system has a nonzero solution. 

Accordingly, if we begin with fewer equations than unknowns, then, in echelon form, $r < n$, and the
system has a nonzero solution. This proves the following important result.

THEOREM 3.13: A homogeneous system AX = 0 with more unknowns than equations has a nonzero
solution.

EXAMPLE 3.17 Determine whether or not each of the following homogeneous systems has a nonzero
solution:

(a)
\[ \begin{aligned} x + y - z &= 0 \\ 2x - 3y + z &= 0 \\ x - 4y + 2z &= 0 \end{aligned} \]

(b)
\[ \begin{aligned} x + y - z &= 0 \\ 2x + 4y - z &= 0 \\ 3x + 2y + 2z &= 0 \end{aligned} \]

(c)
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 + 4x_4 &= 0 \\ 2x_1 - 3x_2 + 5x_3 - 7x_4 &= 0 \\ 5x_1 + 6x_2 - 9x_3 + 8x_4 &= 0 \end{aligned} \]

(a) Reduce the system to echelon form as follows:

\[ \begin{aligned} x + y - z &= 0 \\ -5y + 3z &= 0 \\ -5y + 3z &= 0 \end{aligned} \qquad\text{and then}\qquad \begin{aligned} x + y - z &= 0 \\ -5y + 3z &= 0 \end{aligned} \]

The system has a nonzero solution, because there are only two equations in the three unknowns in echelon form.
Here z is a free variable. Let us, say, set $z = 5$. Then, by back-substitution, $y = 3$ and $x = 2$. Thus, the vector
$u = (2, 3, 5)$ is a particular nonzero solution.

(b) Reduce the system to echelon form as follows:

\[ \begin{aligned} x + y - z &= 0 \\ 2y + z &= 0 \\ -y + 5z &= 0 \end{aligned} \qquad\text{and then}\qquad \begin{aligned} x + y - z &= 0 \\ 2y + z &= 0 \\ 11z &= 0 \end{aligned} \]

In echelon form, there are three equations in three unknowns. Thus, the system has only the zero solution.

(c) The system must have a nonzero solution (Theorem 3.13), because there are four unknowns but only three
equations. (Here we do not need to reduce the system to echelon form.)


Basis for the General Solution of a Homogeneous System 

Let W denote the general solution of a homogeneous system AX = 0. A list of nonzero solution vectors
$u_1, u_2, \ldots, u_s$ of the system is said to be a basis for W if each solution vector $w \in W$ can be expressed
uniquely as a linear combination of the vectors $u_1, u_2, \ldots, u_s$; that is, there exist unique scalars
$a_1, a_2, \ldots, a_s$ such that

\[ w = a_1u_1 + a_2u_2 + \cdots + a_su_s \]

The number s of such basis vectors is equal to the number of free variables. This number s is called the
dimension of W, written as dim W = s. When W = {0} (that is, the system has only the zero solution),
we define dim W = 0.

The following theorem, proved in Chapter 5, page 171, tells us how to find such a basis. 

THEOREM 3.14: Let W be the general solution of a homogeneous system AX = 0, and suppose that the
echelon form of the homogeneous system has s free variables. Let $u_1, u_2, \ldots, u_s$ be the
solutions obtained by setting one of the free variables equal to 1 (or any nonzero
constant) and the remaining free variables equal to 0. Then dim W = s, and the
vectors $u_1, u_2, \ldots, u_s$ form a basis of W.

We emphasize that the general solution W may have many bases, and that Theorem 3.14 only gives us
one such basis.

EXAMPLE 3.18 Find the dimension and a basis for the general solution W of the homogeneous system

\[ \begin{aligned} x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\ 2x_1 + 4x_2 - 5x_3 + x_4 - 6x_5 &= 0 \\ 5x_1 + 10x_2 - 13x_3 + 4x_4 - 16x_5 &= 0 \end{aligned} \]

First reduce the system to echelon form. Apply the following operations:

“Replace $L_2$ by $-2L_1 + L_2$” and “Replace $L_3$ by $-5L_1 + L_3$” and then “Replace $L_3$ by $-2L_2 + L_3$”

These operations yield

\[ \begin{aligned} x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\ x_3 - 3x_4 + 2x_5 &= 0 \\ 2x_3 - 6x_4 + 4x_5 &= 0 \end{aligned} \qquad\text{and}\qquad \begin{aligned} x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\ x_3 - 3x_4 + 2x_5 &= 0 \end{aligned} \]

The system in echelon form has three free variables, $x_2, x_4, x_5$; hence, dim W = 3. Three solution vectors that form a
basis for W are obtained as follows:

(1) Set $x_2 = 1$, $x_4 = 0$, $x_5 = 0$. Back-substitution yields the solution $u_1 = (-2, 1, 0, 0, 0)$.

(2) Set $x_2 = 0$, $x_4 = 1$, $x_5 = 0$. Back-substitution yields the solution $u_2 = (7, 0, 3, 1, 0)$.

(3) Set $x_2 = 0$, $x_4 = 0$, $x_5 = 1$. Back-substitution yields the solution $u_3 = (-2, 0, -2, 0, 1)$.

The vectors $u_1 = (-2, 1, 0, 0, 0)$, $u_2 = (7, 0, 3, 1, 0)$, $u_3 = (-2, 0, -2, 0, 1)$ form a basis for W.

Remark: Any solution of the system in Example 3.18 can be written in the form

\[ au_1 + bu_2 + cu_3 = a(-2, 1, 0, 0, 0) + b(7, 0, 3, 1, 0) + c(-2, 0, -2, 0, 1) = (-2a + 7b - 2c,\ a,\ 3b - 2c,\ b,\ c) \]

or

\[ x_1 = -2a + 7b - 2c, \quad x_2 = a, \quad x_3 = 3b - 2c, \quad x_4 = b, \quad x_5 = c \]

where a,b,c are arbitrary constants. Observe that this representation is nothing more than the parametric 
form of the general solution under the choice of parameters x 2 = a, x 4 = b, x 5 = c. 
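The construction in Theorem 3.14 (set one free variable to 1 and the others to 0) can be sketched as follows. The name `solution_basis` is hypothetical, and the sketch reads the basis vectors off the row canonical form instead of using back-substitution on the echelon form; both routes give the same vectors.

```python
from fractions import Fraction

def solution_basis(rows):
    """Basis of the solution space of AX = 0, one vector per free variable."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a[0])
    pivots, r = [], 0
    for c in range(n):                        # reduce A to row canonical form
        src = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if src is None:
            continue
        a[r], a[src] = a[src], a[r]
        a[r] = [x / a[r][c] for x in a[r]]
        for i in range(len(a)):
            if i != r:
                m = -a[i][c]
                a[i] = [m * p + x for p, x in zip(a[r], a[i])]
        pivots.append(c)
        r += 1
        if r == len(a):
            break
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        u = [Fraction(0)] * n
        u[f] = Fraction(1)                    # this free variable is 1, the others 0
        for row, p in enumerate(pivots):
            u[p] = -a[row][f]                 # solve for each pivot variable
        basis.append(u)
    return basis

A = [[1, 2, -3, 2, -4],
     [2, 4, -5, 1, -6],
     [5, 10, -13, 4, -16]]
basis = solution_basis(A)
```

The number of vectors returned equals the number of free variables, that is, dim W.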


Nonhomogeneous and Associated Homogeneous Systems 

Let AX = B be a nonhomogeneous system of linear equations. Then AX = 0 is called the associated
homogeneous system. For example,

\[ \begin{aligned} x + 2y - 4z &= 7 \\ 3x - 5y + 6z &= 8 \end{aligned} \qquad\text{and}\qquad \begin{aligned} x + 2y - 4z &= 0 \\ 3x - 5y + 6z &= 0 \end{aligned} \]

show a nonhomogeneous system and its associated homogeneous system.

The relationship between the solution U of a nonhomogeneous system AX = B and the solution W of its
associated homogeneous system AX = 0 is contained in the following theorem.

THEOREM 3.15: Let $v_0$ be a particular solution of AX = B and let W be the general solution of AX = 0.
Then the following is the general solution of AX = B:

\[ U = v_0 + W = \{ v_0 + w : w \in W \} \]

That is, $U = v_0 + W$ is obtained by adding $v_0$ to each element in W. We note that this theorem has a
geometrical interpretation in $R^3$. Specifically, suppose W is a line through the origin O. Then, as pictured
in Fig. 3-4, $U = v_0 + W$ is the line parallel to W obtained by adding $v_0$ to each element of W. Similarly,
whenever W is a plane through the origin O, then $U = v_0 + W$ is a plane parallel to W.

3.12 Elementary Matrices

Let e denote an elementary row operation and let e(A) denote the results of applying the operation e to a
matrix A. Now let E be the matrix obtained by applying e to the identity matrix I; that is,

\[ E = e(I) \]

Then E is called the elementary matrix corresponding to the elementary row operation e. Note that E is
always a square matrix.

EXAMPLE 3.19 Consider the following three elementary row operations:

(1) Interchange $R_2$ and $R_3$. (2) Replace $R_2$ by $-6R_2$. (3) Replace $R_3$ by $-4R_1 + R_3$.

The 3 × 3 elementary matrices corresponding to the above elementary row operations are as follows:

\[ E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -6 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \]

The following theorem, proved in Problem 3.34, holds. 

THEOREM 3.16: Let e be an elementary row operation and let E be the corresponding m x m 
elementary matrix. Then 

\[ e(A) = EA \]

where A is any m x n matrix. 

In other words, the result of applying an elementary row operation e to a matrix A can be obtained by 
premultiplying A by the corresponding elementary matrix E. 
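Theorem 3.16 can be checked numerically for the row-addition operation. In this minimal sketch the helpers `identity`, `matmul`, and `add_multiple` are hypothetical names, the matrix `A` is an arbitrary 3 × 4 example chosen for illustration, and rows are indexed from 0.

```python
def identity(n):
    """The n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(P, Q):
    """Matrix product PQ for lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def add_multiple(M, k, i, j):
    """Row operation e: replace R_j by k*R_i + R_j (0-based row indices)."""
    out = [row[:] for row in M]
    out[j] = [k * x + y for x, y in zip(M[i], M[j])]
    return out

A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

E3 = add_multiple(identity(3), -4, 0, 2)   # E = e(I) for "Replace R_3 by -4R_1 + R_3"
direct = add_multiple(A, -4, 0, 2)         # e(A), applied to the rows of A
via_product = matmul(E3, A)                # EA
```

Here `E3` is the matrix $E_3$ of Example 3.19, and `direct` and `via_product` agree, as the theorem states.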

Now suppose $e'$ is the inverse of an elementary row operation e, and let $E'$ and E be the corresponding
matrices. We note (Problem 3.33) that E is invertible and $E'$ is its inverse. This means, in particular, that
any product

\[ P = E_k \cdots E_2E_1 \]

of elementary matrices is invertible.
Applications of Elementary Matrices

Using Theorem 3.16, we are able to prove (Problem 3.35) the following important properties of matrices. 

THEOREM 3.17: Let A be a square matrix. Then the following are equivalent: 

(a) A is invertible (nonsingular). 

(b) A is row equivalent to the identity matrix I. 

(c) A is a product of elementary matrices. 

Recall that square matrices A and B are inverses if AB = BA = I. The next theorem (proved in Problem
3.36) demonstrates that we need only show that one of the products equals I, say AB = I, to prove that
the matrices are inverses.

THEOREM 3.18: Suppose AB = I. Then BA = I, and hence, $B = A^{-1}$.

Row equivalence can also be defined in terms of matrix multiplication. Specifically, we will prove 
(Problem 3.37) the following. 

THEOREM 3.19: B is row equivalent to A if and only if there exists a nonsingular matrix P such that 
B = PA. 

Application to Finding the Inverse of an n x n Matrix 

The following algorithm finds the inverse of a matrix. 

ALGORITHM 3.5: The input is a square matrix A. The output is the inverse of A or that the inverse does 
not exist. 

Step 1. Form the n x 2 n (block) matrix M = [A, I], where A is the left half of M and the identity matrix I 
is the right half of M. 

Step 2. Row reduce M to echelon form. If the process generates a zero row in the A half of M, then

STOP

A has no inverse. (Otherwise A is in triangular form.)

Step 3. Further row reduce M to its row canonical form

\[ M \sim [I, B] \]

where the identity matrix I has replaced A in the left half of M.

Step 4. Set $A^{-1} = B$, the matrix that is now in the right half of M.

The justification for the above algorithm is as follows. Suppose A is invertible and, say, the sequence
of elementary row operations $e_1, e_2, \ldots, e_q$ applied to $M = [A, I]$ reduces the left half of M, which is A, to
the identity matrix I. Let $E_i$ be the elementary matrix corresponding to the operation $e_i$. Then, by applying
Theorem 3.16, we get

\[ E_q \cdots E_2E_1A = I \quad\text{or}\quad (E_q \cdots E_2E_1I)A = I, \quad\text{so}\quad A^{-1} = E_q \cdots E_2E_1I \]

That is, $A^{-1}$ can be obtained by applying the elementary row operations $e_1, e_2, \ldots, e_q$ to the identity
matrix I, which appears in the right half of M. Thus, $B = A^{-1}$, as claimed.

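EXAMPLE 3.20

Find the inverse of the matrix

\[ A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{bmatrix} \]

First form the (block) matrix $M = [A, I]$ and row reduce M to an echelon form:

\[ M = \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 2 & -1 & 3 & 0 & 1 & 0 \\ 4 & 1 & 8 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 1 & 0 & -4 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 0 & -1 & -6 & 1 & 1 \end{array}\right] \]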

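In echelon form, the left half of M is in triangular form; hence, A has an inverse. Next we further row reduce M to its
row canonical form:

\[
M \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & -1 & 0 & 4 & 0 & -1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{array}\right]
\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & 1 & 0 & -4 & 0 & 1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{array}\right]
\]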


The identity matrix is now in the left half of the final matrix; hence, the right half is A^{-1}. In other words,

\[ A^{-1} = \begin{bmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{bmatrix} \]


Elementary Column Operations 

Now let A be a matrix with columns C_1, C_2, ..., C_n. The following operations on A, analogous to the
elementary row operations, are called elementary column operations:

[F_1] (Column Interchange): Interchange columns C_i and C_j.

[F_2] (Column Scaling): Replace C_i by kC_i (where k ≠ 0).

[F_3] (Column Addition): Replace C_j by kC_i + C_j.

We may indicate each of the column operations by writing, respectively,

(1) C_i <-> C_j, (2) kC_i -> C_i, (3) (kC_i + C_j) -> C_j

Moreover, each column operation has an inverse operation of the same type, just like the corresponding
row operation.

Now let f denote an elementary column operation, and let F be the matrix obtained by applying f to the
identity matrix I; that is,

F = f(I)

Then F is called the elementary matrix corresponding to the elementary column operation f. Note that F is
always a square matrix.

EXAMPLE 3.21 

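Consider the following elementary column operations:

(1) Interchange C_1 and C_3; (2) Replace C_3 by -2C_3; (3) Replace C_3 by -3C_2 + C_3

The corresponding three 3 x 3 elementary matrices are as follows:

\[
F_1 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \quad
F_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{bmatrix}, \quad
F_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -3 \\ 0 & 0 & 1 \end{bmatrix}
\]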


The following theorem is analogous to Theorem 3.16 for the elementary row operations.

THEOREM 3.20: For any matrix A, f(A) = AF.

That is, the result of applying an elementary column operation f on a matrix A can be obtained by
postmultiplying A by the corresponding elementary matrix F.
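Theorem 3.20 can be checked numerically on a small case. The sketch below is only an illustration: the helper names `matmul` and `add_column`, the 0-based column indices, and the sample matrix `A` are assumptions introduced here, not part of the text.

```python
def matmul(a, b):
    """Multiply matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add_column(a, k, i, j):
    """Column addition: replace C_j by k*C_i + C_j (columns are 0-based here)."""
    return [[row[c] + (k * row[i] if c == j else 0) for c in range(len(row))]
            for row in a]

identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Elementary matrix F for "Replace C_3 by -3C_2 + C_3": apply the operation to I.
F = add_column(identity3, -3, 1, 2)

# A hypothetical 2 x 3 matrix; applying f directly and postmultiplying by F agree.
A = [[1, 2, 3], [4, 5, 6]]
direct = add_column(A, -3, 1, 2)
via_product = matmul(A, F)
```

Here `F` is the matrix with rows (1, 0, 0), (0, 1, -3), (0, 0, 1), and both `direct` and `via_product` have rows (1, 2, -3) and (4, 5, -9).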
Matrix Equivalence

A matrix B is equivalent to a matrix A if B can be obtained from A by a sequence of row and column 
operations. Alternatively, B is equivalent to A, if there exist nonsingular matrices P and Q such that 
B = PAQ. Just like row equivalence, equivalence of matrices is an equivalence relation. 

The main result of this subsection (proved in Problem 3.38) is as follows. 

THEOREM 3.21: Every m x n matrix A is equivalent to a unique block matrix of the form

\[ \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \]

where I_r is the r-square identity matrix.

The following definition applies. 

DEFINITION: The nonnegative integer r in Theorem 3.21 is called the rank of A, written rank(A). 
Note that this definition agrees with the previous definition of the rank of a matrix. 


3.13 LU DECOMPOSITION 


Suppose A is a nonsingular matrix that can be brought into (upper) triangular form U using only row- 
addition operations; that is, suppose A can be triangularized by the following algorithm, which we write 
using computer notation. 

ALGORITHM 3.6: The input is a matrix A and the output is a triangular matrix U.

Step 1. Repeat for i = 1, 2, ..., n - 1:

Step 2. Repeat for j = i + 1, i + 2, ..., n:

(a) Set m_ij := -a_ij / a_ii.

(b) Set R_j := m_ij R_i + R_j.

[End of Step 2 inner loop.]

[End of Step 1 outer loop.]

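The numbers m_ij are called multipliers. Sometimes we keep track of these multipliers by means of the
following lower triangular matrix L:

\[
L = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
-m_{21} & 1 & 0 & \cdots & 0 & 0 \\
-m_{31} & -m_{32} & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
-m_{n1} & -m_{n2} & -m_{n3} & \cdots & -m_{n,n-1} & 1
\end{bmatrix}
\]

That is, L has 1's on the diagonal, 0's above the diagonal, and the negative of the multiplier m_ij as its
ij-entry below the diagonal.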

The above matrix L and the triangular matrix U obtained in Algorithm 3.6 give us the classical LU 
factorization of such a matrix A. Namely, 

THEOREM 3.22: Let A be a nonsingular matrix that can be brought into triangular form U using only
row-addition operations. Then A = LU, where L is the above lower triangular matrix
with 1's on the diagonal, and U is an upper triangular matrix with no 0's on the
diagonal.
EXAMPLE 3.22 Suppose

\[ A = \begin{bmatrix} 1 & 2 & -3 \\ -3 & -4 & 13 \\ 2 & 1 & -5 \end{bmatrix} \]

We note that A may be reduced to triangular form by the operations

"Replace R_2 by 3R_1 + R_2"; "Replace R_3 by -2R_1 + R_3"; and then "Replace R_3 by (3/2)R_2 + R_3"

That is,

\[
A \sim \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & -3 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix}
\]

This gives us the classical factorization A = LU, where

\[
L = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 2 & -\tfrac{3}{2} & 1 \end{bmatrix}
\quad \text{and} \quad
U = \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix}
\]

We emphasize:

(1) The entries -3, 2, -3/2 in L are the negatives of the multipliers in the above elementary row operations.

(2) U is the triangular form of A.
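Algorithm 3.6 together with the bookkeeping matrix L can be sketched as follows. This is a hedged illustration under stated assumptions: the name `lu_factor` and the list-of-rows representation are choices made here, exact fractions are used, and the routine simply raises an error when a zero pivot shows that row-addition operations alone do not suffice.

```python
from fractions import Fraction

def lu_factor(a):
    """Return (L, U) with A = LU, using only row-addition operations (no interchanges)."""
    n = len(a)
    u = [[Fraction(x) for x in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        if u[i][i] == 0:
            raise ValueError("zero pivot: row-addition operations alone do not suffice")
        for j in range(i + 1, n):
            m = -u[j][i] / u[i][i]                           # multiplier m_ji
            u[j] = [m * x + y for x, y in zip(u[i], u[j])]   # R_j := m_ji R_i + R_j
            l[j][i] = -m                                     # L stores the negative of the multiplier
    return l, u

L, U = lu_factor([[1, 2, -3], [-3, -4, 13], [2, 1, -5]])
```

For the matrix of Example 3.22 this yields L with rows (1, 0, 0), (-3, 1, 0), (2, -3/2, 1) and U with rows (1, 2, -3), (0, 2, 4), (0, 0, 7).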


Application to Systems of Linear Equations 

Consider a computer algorithm M. Let C(n) denote the running time of the algorithm as a function of the
size n of the input data. [The function C(n) is sometimes called the time complexity or simply the
complexity of the algorithm M.] Frequently, C(n) simply counts the number of multiplications and
divisions executed by M, but does not count the number of additions and subtractions because they take
much less time to execute.

Now consider a square system of linear equations AX = B, where

A = [a_ij], X = [x_1, ..., x_n]^T, B = [b_1, ..., b_n]^T

and suppose A has an LU factorization. Then the system can be brought into triangular form (in order to
apply back-substitution) by applying Algorithm 3.6 to the augmented matrix M = [A, B] of the system.
The time complexity of Algorithm 3.6 and back-substitution are, respectively, 

C(n) = O(n^3) and C(n) = O(n^2)

where n is the number of equations. 

On the other hand, suppose we already have the factorization A = LU. Then, to triangularize the 
system, we need only apply the row operations in the algorithm (retained by the matrix L) to the column 
vector B. In this case, the time complexity is 

C(n) = O(n^2)

Of course, obtaining the factorization A = LU requires the original algorithm, where C(n) = O(n^3). Thus,
nothing may be gained by first finding the LU factorization when a single system is involved. However, 
there are situations, illustrated below, where the LU factorization is useful. 

Suppose, for a given matrix A, we need to solve the system 


AX = B 
repeatedly for a sequence of different constant vectors, say B_1, B_2, ..., B_k. Also, suppose some of the B_i
depend upon the solution of the system obtained while using preceding vectors B_j. In such a case, it is
more efficient to first find the LU factorization of A, and then to use this factorization to solve the system
for each new B.


EXAMPLE 3.23 Consider the following system of linear equations:

 x +  2y +  z = k_1
2x +  3y + 3z = k_2
3x + 10y + 2z = k_3

or AX = B, where A is the matrix of coefficients of the system and B = [k_1, k_2, k_3]^T.

Suppose we want to solve the system three times where B is equal, say, to B_1, B_2, B_3. Furthermore, suppose
B_1 = [1, 1, 1]^T, and suppose

B_{j+1} = B_j + X_j (for j = 1, 2)

where X_j is the solution of AX = B_j. Here it is more efficient to first obtain the LU factorization of A and then use the
LU factorization to solve the system for each of the B's. (This is done in Problem 3.42.)
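The reuse of a single factorization for a chain of right-hand sides can be sketched as below. To keep the illustration verifiable, it assumes the factors L and U of the matrix in Example 3.22 rather than the system above; the function name `lu_solve` and the variable names are hypothetical. Each solve costs only a forward and a back substitution.

```python
from fractions import Fraction

def lu_solve(l, u, b):
    """Solve LUx = b: forward substitution for Ly = b, then back substitution for Ux = y."""
    n = len(b)
    y = []
    for i in range(n):  # L has 1's on the diagonal, so no division is needed
        y.append(Fraction(b[i]) - sum(l[i][k] * y[k] for k in range(i)))
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(u[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - known) / u[i][i]
    return x

# Factors taken from Example 3.22 (A = LU).
A = [[1, 2, -3], [-3, -4, 13], [2, 1, -5]]
L = [[1, 0, 0], [-3, 1, 0], [2, Fraction(-3, 2), 1]]
U = [[1, 2, -3], [0, 2, 4], [0, 0, 7]]

# Solve for B_1, B_2, B_3 with B_1 = [1, 1, 1]^T and B_{j+1} = B_j + X_j.
b = [1, 1, 1]
solutions = []
for _ in range(3):
    x = lu_solve(L, U, b)
    solutions.append(x)
    b = [bi + xi for bi, xi in zip(b, x)]
```

The factorization is computed once; only the vector `b` changes between iterations, which is the situation in which the LU form pays off.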


SOLVED PROBLEMS 


Linear Equations, Solutions, 2x2 Systems 

3.1. Determine whether each of the following equations is linear: 

(a) 5x + 7y - 8yz = 16, (b) x + πy + ez = log 5, (c) 3x + ky - 8z = 16

(a) No, because the product yz of two unknowns is of second degree.

(b) Yes, because π, e, and log 5 are constants.

(c) As it stands, there are four unknowns: x, y, z, k. Because of the term ky it is not a linear equation. However, 
assuming k is a constant, the equation is linear in the unknowns x, y, z. 

3.2. Determine whether the following vectors are solutions of x_1 + 2x_2 - 4x_3 + 3x_4 = 15:

(a) u = (3, 2, 1, 4) and (b) v = (1, 2, 4, 5).

(a) Substitute to obtain 3 + 2(2) - 4(1) + 3(4) = 15, or 15 = 15; yes, it is a solution.

(b) Substitute to obtain 1 + 2(2) - 4(4) + 3(5) = 15, or 4 = 15; no, it is not a solution.

3.3. Solve (a) ex = π, (b) 3x - 4 - x = 2x + 3, (c) 7 + 2x - 4 = 3x + 3 - x

(a) Because e ≠ 0, multiply by 1/e to obtain x = π/e.

(b) Rewrite in standard form, obtaining 0x = 7. The equation has no solution.

(c) Rewrite in standard form, obtaining 0x = 0. Every scalar k is a solution.

3.4. Prove Theorem 3.4: Consider the equation ax = b.

(i) If a ≠ 0, then x = b/a is a unique solution of ax = b.

(ii) If a = 0 but b ≠ 0, then ax = b has no solution.

(iii) If a = 0 and b = 0, then every scalar k is a solution of ax = b.

Suppose a ≠ 0. Then the scalar b/a exists. Substituting b/a in ax = b yields a(b/a) = b, or b = b;
hence, b/a is a solution. On the other hand, suppose x_0 is a solution to ax = b, so that ax_0 = b. Multiplying
both sides by 1/a yields x_0 = b/a. Hence, b/a is the unique solution of ax = b. Thus, (i) is proved.

On the other hand, suppose a = 0. Then, for any scalar k, we have ak = 0k = 0. If b ≠ 0, then ak ≠ b.
Accordingly, k is not a solution of ax = b, and so (ii) is proved. If b = 0, then ak = b. That is, any scalar k is a
solution of ax = b, and so (iii) is proved.
3.5. Solve each of the following systems:

(a)  2x - 5y = 11      (b)   2x - 3y = 8      (c)   2x - 3y = 8
     3x + 4y = 5            -6x + 9y = 6           -4x + 6y = -16

(a) Eliminate x from the equations by forming the new equation L = -3L_1 + 2L_2. This yields the equation

23y = -23, and so y = -1

Substitute y = -1 in one of the original equations, say L_1, to get

2x - 5(-1) = 11 or 2x + 5 = 11 or 2x = 6 or x = 3

Thus, x = 3, y = -1 or the pair u = (3, -1) is the unique solution of the system.

(b) Eliminate x from the equations by forming the new equation L = 3L_1 + L_2. This yields the equation

0x + 0y = 30

This is a degenerate equation with a nonzero constant; hence, this equation and the system have no
solution. (Geometrically, the lines corresponding to the equations are parallel.)

(c) Eliminate x from the equations by forming the new equation L = 2L_1 + L_2. This yields the equation

0x + 0y = 0

This is a degenerate equation where the constant term is also zero. Thus, the system has an infinite
number of solutions, which correspond to the solution of either equation. (Geometrically, the lines
corresponding to the equations coincide.)

To find the general solution, set y = a and substitute in L_1 to obtain

2x - 3a = 8 or 2x = 3a + 8 or x = (3/2)a + 4

Thus, the general solution is

x = (3/2)a + 4, y = a or u = ((3/2)a + 4, a)

where a is any scalar.

3.6. Consider the system

 x + ay = 4
ax + 9y = b

(a) For which values of a does the system have a unique solution?

(b) Find those pairs of values (a, b) for which the system has more than one solution.

(a) Eliminate x from the equations by forming the new equation L = -aL_1 + L_2. This yields the equation

(9 - a^2)y = b - 4a     (1)

The system has a unique solution if and only if the coefficient of y in (1) is not zero; that is, if
9 - a^2 ≠ 0 or if a ≠ ±3.

(b) The system has more than one solution if both sides of (1) are zero. The left-hand side is zero when a = ±3.
When a = 3, the right-hand side is zero when b - 12 = 0 or b = 12. When a = -3, the right-hand side is
zero when b + 12 = 0 or b = -12. Thus, (3, 12) and (-3, -12) are the pairs for which the system has more
than one solution.

Systems in Triangular and Echelon Form 

3.7. Determine the pivot and free variables in each of the following systems:

(a) 2x_1 - 3x_2 - 6x_3 - 5x_4 + 2x_5 = 7      (b) 2x - 6y + 7z = 1      (c)  x + 2y - 3z = 2
                   x_3 + 3x_4 - 7x_5 = 6               4y + 3z = 8           2x + 3y +  z = 4
                          x_4 - 2x_5 = 1                    2z = 4           3x + 4y + 5z = 8

(a) In echelon form, the leading unknowns are the pivot variables, and the others are the free variables. Here x_1,
x_3, x_4 are the pivot variables, and x_2 and x_5 are the free variables.

(b) The leading unknowns are x, y, z, so they are the pivot variables. There are no free variables (as in any
triangular system).

(c) The notion of pivot and free variables applies only to a system in echelon form.


3.8. Solve the triangular system in Problem 3.7(b).

Because it is a triangular system, solve by back-substitution.

(i) The last equation gives z = 2.

(ii) Substitute z = 2 in the second equation to get 4y + 6 = 8 or y = 1/2.

(iii) Substitute z = 2 and y = 1/2 in the first equation to get

2x - 6(1/2) + 7(2) = 1 or 2x + 11 = 1 or x = -5

Thus, x = -5, y = 1/2, z = 2 or u = (-5, 1/2, 2) is the unique solution to the system.
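Back-substitution for a triangular system is mechanical enough to sketch in a few lines. The function name `back_substitute` and the coefficient-matrix/constant-vector layout are assumptions made for this illustration; the data are those of Problem 3.7(b).

```python
from fractions import Fraction

def back_substitute(u, b):
    """Solve Ux = b for an upper triangular U with nonzero diagonal entries."""
    n = len(b)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        # Contribution of the unknowns that have already been found.
        known = sum(u[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (Fraction(b[i]) - known) / u[i][i]
    return x

# 2x - 6y + 7z = 1, 4y + 3z = 8, 2z = 4
solution = back_substitute([[2, -6, 7], [0, 4, 3], [0, 0, 2]], [1, 8, 4])
```

The result is x = -5, y = 1/2, z = 2, found in the same order (last unknown first) as in the hand computation.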


3.9. Solve the echelon system in Problem 3.7(a).

Assign parameters to the free variables, say x_2 = a and x_5 = b, and solve for the pivot variables by
back-substitution.

(i) Substitute x_5 = b in the last equation to get x_4 - 2b = 1 or x_4 = 2b + 1.

(ii) Substitute x_5 = b and x_4 = 2b + 1 in the second equation to get

x_3 + 3(2b + 1) - 7b = 6 or x_3 - b + 3 = 6 or x_3 = b + 3

(iii) Substitute x_5 = b, x_4 = 2b + 1, x_3 = b + 3, x_2 = a in the first equation to get

2x_1 - 3a - 6(b + 3) - 5(2b + 1) + 2b = 7 or 2x_1 - 3a - 14b - 23 = 7
or x_1 = (3/2)a + 7b + 15

Thus,

x_1 = (3/2)a + 7b + 15, x_2 = a, x_3 = b + 3, x_4 = 2b + 1, x_5 = b

or u = ((3/2)a + 7b + 15, a, b + 3, 2b + 1, b)

is the parametric form of the general solution.

Alternatively, solving for the pivot variables x_1, x_3, x_4 in terms of the free variables x_2 and x_5 yields the
following free-variable form of the general solution:

x_1 = (3/2)x_2 + 7x_5 + 15, x_3 = x_5 + 3, x_4 = 2x_5 + 1


3.10. Prove Theorem 3.6. Consider the system (3.4) of linear equations in echelon form with r equations 
and n unknowns. 

(i) If r = n, then the system has a unique solution. 

(ii) If r < n, then we can arbitrarily assign values to the n - r free variables and solve uniquely for the r
pivot variables, obtaining a solution of the system.

(i) Suppose r = n. Then we have a square system AX = B where the matrix A of coefficients is (upper) 
triangular with nonzero diagonal elements. Thus, A is invertible. By Theorem 3.10, the system has a 
unique solution. 

(ii) Assigning values to the n — r free variables yields a triangular system in the pivot variables, which, by 
(i), has a unique solution. 
Gaussian Elimination

3.11. Solve each of the following systems:

(a)  x + 2y - 4z = -4      (b)   x + 2y - 3z = -1      (c)  x + 2y -  3z = 1
    2x + 5y - 9z = -10         -3x +  y - 2z = -7          2x + 5y -  8z = 4
    3x - 2y + 3z = 11           5x + 3y - 4z = 2           3x + 8y - 13z = 7

Reduce each system to triangular or echelon form using Gaussian elimination: 

(a) Apply "Replace L_2 by -2L_1 + L_2" and "Replace L_3 by -3L_1 + L_3" to eliminate x from the second and
third equations, and then apply "Replace L_3 by 8L_2 + L_3" to eliminate y from the third equation. These
operations yield

x + 2y -  4z = -4                 x + 2y - 4z = -4
     y -   z = -2     and then         y -  z = -2
   -8y + 15z = 23                          7z = 7

The system is in triangular form. Solve by back-substitution to obtain the unique solution u = (2, -1, 1).

(b) Eliminate x from the second and third equations by the operations "Replace L_2 by 3L_1 + L_2" and "Replace
L_3 by -5L_1 + L_3." This gives the equivalent system

x + 2y -  3z = -1
    7y - 11z = -10
   -7y + 11z = 7

The operation "Replace L_3 by L_2 + L_3" yields the following degenerate equation with a nonzero
constant:

0x + 0y + 0z = -3

This equation and hence the system have no solution.

(c) Eliminate x from the second and third equations by the operations "Replace L_2 by -2L_1 + L_2" and
"Replace L_3 by -3L_1 + L_3." This yields the new system

x + 2y - 3z = 1               x + 2y - 3z = 1
     y - 2z = 2       or           y - 2z = 2
    2y - 4z = 4

(The third equation is deleted, because it is a multiple of the second equation.) The system is in echelon
form with pivot variables x and y and free variable z.

To find the parametric form of the general solution, set z = a and solve for x and y by
back-substitution. Substitute z = a in the second equation to get y = 2 + 2a. Then substitute z = a and
y = 2 + 2a in the first equation to get

x + 2(2 + 2a) - 3a = 1 or x + 4 + a = 1 or x = -3 - a

Thus, the general solution is

x = -3 - a, y = 2 + 2a, z = a or u = (-3 - a, 2 + 2a, a)

where a is a parameter.
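The three outcomes in Problem 3.11 (unique solution, no solution, infinitely many solutions) can be read off from the pivot positions of the echelon form of the augmented matrix. The sketch below assumes hypothetical helper names `echelon` and `classify` and uses exact fractions; a pivot in the column of constants corresponds to a degenerate equation with a nonzero constant.

```python
from fractions import Fraction

def echelon(m):
    """Row reduce m to echelon form; return the reduced matrix and the pivot columns."""
    m = [[Fraction(x) for x in row] for row in m]
    rows, cols = len(m), len(m[0])
    r, pivots = 0, []
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, rows):
            k = -m[i][c] / m[r][c]
            m[i] = [k * x + y for x, y in zip(m[r], m[i])]  # R_i := k R_r + R_i
        pivots.append(c)
        r += 1
    return m, pivots

def classify(augmented):
    """Classify a linear system given by its augmented matrix [A, B]."""
    _, pivots = echelon(augmented)
    n = len(augmented[0]) - 1          # number of unknowns
    if n in pivots:                    # pivot in the constants column: 0 = nonzero
        return "no solution"
    return "unique solution" if len(pivots) == n else "infinitely many solutions"
```

Applied to the augmented matrices of parts (a), (b), and (c), `classify` reports a unique solution, no solution, and infinitely many solutions, respectively.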

3.12. Solve each of the following systems:

(a)  x_1 - 3x_2 + 2x_3 -  x_4 + 2x_5 = 2      (b)  x_1 +  2x_2 - 3x_3 + 4x_4 = 2
    3x_1 - 9x_2 + 7x_3 -  x_4 + 3x_5 = 7          2x_1 +  5x_2 - 2x_3 +  x_4 = 1
    2x_1 - 6x_2 + 7x_3 + 4x_4 - 5x_5 = 7          5x_1 + 12x_2 - 7x_3 + 6x_4 = 3

Reduce each system to echelon form using Gaussian elimination:

(a) Apply "Replace L_2 by -3L_1 + L_2" and "Replace L_3 by -2L_1 + L_3" to eliminate x_1 from the second and
third equations. This yields

x_1 - 3x_2 + 2x_3 -  x_4 + 2x_5 = 2               x_1 - 3x_2 + 2x_3 - x_4 + 2x_5 = 2
              x_3 + 2x_4 - 3x_5 = 1       or                    x_3 + 2x_4 - 3x_5 = 1
             3x_3 + 6x_4 - 9x_5 = 3

(We delete L_3, because it is a multiple of L_2.) The system is in echelon form with pivot variables x_1 and
x_3 and free variables x_2, x_4, x_5.

To find the parametric form of the general solution, set x_2 = a, x_4 = b, x_5 = c, where a, b, c are
parameters. Back-substitution yields x_3 = 1 - 2b + 3c and x_1 = 3a + 5b - 8c. The general solution is

x_1 = 3a + 5b - 8c, x_2 = a, x_3 = 1 - 2b + 3c, x_4 = b, x_5 = c

or, equivalently, u = (3a + 5b - 8c, a, 1 - 2b + 3c, b, c).

(b) Eliminate x_1 from the second and third equations by the operations "Replace L_2 by -2L_1 + L_2" and
"Replace L_3 by -5L_1 + L_3." This yields the system

x_1 + 2x_2 - 3x_3 +  4x_4 = 2
       x_2 + 4x_3 -  7x_4 = -3
      2x_2 + 8x_3 - 14x_4 = -7

The operation "Replace L_3 by -2L_2 + L_3" yields the degenerate equation 0 = -1. Thus, the system has
no solution (even though the system has more unknowns than equations).


3.13. Solve using the condensed format:

      2y + 3z = 3
 x +   y +  z = 4
4x +  8y - 3z = 35

The condensed format follows:

Number            Equation              Operation
(2) [was (1)]          2y + 3z = 3      L_1 <-> L_2
(1) [was (2)]      x +  y +  z = 4      L_1 <-> L_2
(3)               4x + 8y - 3z = 35
(3')                   4y - 7z = 19     Replace L_3 by -4L_1 + L_3
(3'')                     -13z = 13     Replace L_3 by -2L_2 + L_3

Here (1), (2), and (3'') form a triangular system. (We emphasize that the interchange of L_1 and L_2 is
accomplished by simply renumbering L_1 and L_2 as above.)

Using back-substitution with the triangular system yields z = -1 from L_3, y = 3 from L_2, and x = 2
from L_1. Thus, the unique solution of the system is x = 2, y = 3, z = -1 or the triple u = (2, 3, -1).

3.14. Consider the system

 x + 2y +  z = 3
      ay + 5z = 10
2x + 7y + az = b

(a) Find those values of a for which the system has a unique solution.

(b) Find those pairs of values (a, b) for which the system has more than one solution.

Reduce the system to echelon form. That is, eliminate x from the third equation by the operation
"Replace L_3 by -2L_1 + L_3" and then eliminate y from the third equation by the operation
"Replace L_3 by -3L_2 + aL_3." This yields

x + 2y +       z = 3                       x + 2y + z = 3
     ay +     5z = 10         and then          ay + 5z = 10
     3y + (a - 2)z = b - 6                 (a^2 - 2a - 15)z = ab - 6a - 30

Examine the last equation (a^2 - 2a - 15)z = ab - 6a - 30.

(a) The system has a unique solution if and only if the coefficient of z is not zero; that is, if

a^2 - 2a - 15 = (a - 5)(a + 3) ≠ 0 or a ≠ 5 and a ≠ -3.

(b) The system has more than one solution if both sides are zero. The left-hand side is zero when a = 5 or
a = -3. When a = 5, the right-hand side is zero when 5b - 60 = 0, or b = 12. When a = -3, the
right-hand side is zero when -3b - 12 = 0, or b = -4. Thus, (5, 12) and (-3, -4) are the pairs for which the
system has more than one solution.


Echelon Matrices, Row Equivalence, Row Canonical Form 
3.15. Row reduce each of the following matrices to echelon form:

\[
(a) \; A = \begin{bmatrix} 1 & 2 & -3 & 0 \\ 2 & 4 & -2 & 2 \\ 3 & 6 & -4 & 3 \end{bmatrix}, \qquad
(b) \; B = \begin{bmatrix} -4 & 1 & -6 \\ 1 & 2 & -5 \\ 6 & 3 & -4 \end{bmatrix}
\]

(a) Use a_11 = 1 as a pivot to obtain 0's below a_11; that is, apply the row operations "Replace R_2 by
-2R_1 + R_2" and "Replace R_3 by -3R_1 + R_3." Then use a_23 = 4 as a pivot to obtain a 0 below a_23; that is,
apply the row operation "Replace R_3 by -5R_2 + 4R_3." These operations yield

\[
A \sim \begin{bmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 5 & 3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 0 & 2 \end{bmatrix}
\]

The matrix is now in echelon form.

(b) Hand calculations are usually simpler if the pivot element equals 1. Therefore, first interchange R_1 and R_2.
Next apply the operations "Replace R_2 by 4R_1 + R_2" and "Replace R_3 by -6R_1 + R_3"; and then apply the
operation "Replace R_3 by R_2 + R_3." These operations yield

\[
B \sim \begin{bmatrix} 1 & 2 & -5 \\ -4 & 1 & -6 \\ 6 & 3 & -4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -5 \\ 0 & 9 & -26 \\ 0 & -9 & 26 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -5 \\ 0 & 9 & -26 \\ 0 & 0 & 0 \end{bmatrix}
\]

The matrix is now in echelon form.


3.16. Describe the pivoting row-reduction algorithm. Also describe the advantages, if any, of using this
pivoting algorithm.

The row-reduction algorithm becomes a pivoting algorithm if the entry in column j of greatest absolute
value is chosen as the pivot a_{1 j_1} and if one uses the row operation

(-a_{i j_1} / a_{1 j_1}) R_1 + R_i -> R_i

The main advantage of the pivoting algorithm is that the above row operation involves division by the
(current) pivot a_{1 j_1}, and, on the computer, roundoff errors may be substantially reduced when one divides by a
number as large in absolute value as possible.

3.17. Let

\[ A = \begin{bmatrix} 2 & -2 & 2 & 1 \\ -3 & 6 & 0 & -1 \\ 1 & -7 & 10 & 2 \end{bmatrix} \]

Reduce A to echelon form using the pivoting algorithm.

First interchange R_1 and R_2 so that -3 can be used as the pivot, and then apply the operations "Replace R_2 by
(2/3)R_1 + R_2" and "Replace R_3 by (1/3)R_1 + R_3." These operations yield

\[
A \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 2 & -2 & 2 & 1 \\ 1 & -7 & 10 & 2 \end{bmatrix}
\sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & 2 & 2 & \tfrac{1}{3} \\ 0 & -5 & 10 & \tfrac{5}{3} \end{bmatrix}
\]

Now interchange R_2 and R_3 so that -5 can be used as the pivot, and then apply the operation "Replace R_3 by
(2/5)R_2 + R_3." We obtain

\[
A \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & -5 & 10 & \tfrac{5}{3} \\ 0 & 2 & 2 & \tfrac{1}{3} \end{bmatrix}
\sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & -5 & 10 & \tfrac{5}{3} \\ 0 & 0 & 6 & 1 \end{bmatrix}
\]

The matrix has been brought to echelon form using partial pivoting. 
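The pivoting strategy of Problems 3.16 and 3.17 can be sketched as follows. The function name `echelon_partial_pivot` is an assumption of this illustration. Exact fractions are used so that the output can be compared with the hand computation; the choice of the largest pivot matters for roundoff only when floating-point numbers are used instead.

```python
from fractions import Fraction

def echelon_partial_pivot(a):
    """Row reduce a to echelon form, choosing in each column the entry of greatest absolute value as pivot."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        # Candidate pivot: the entry of greatest absolute value at or below row r.
        p = max(range(r, rows), key=lambda i: abs(m[i][c]))
        if m[p][c] == 0:
            continue  # the whole column is zero from row r down
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, rows):
            k = -m[i][c] / m[r][c]
            m[i] = [k * x + y for x, y in zip(m[r], m[i])]
        r += 1
    return m

reduced = echelon_partial_pivot([[2, -2, 2, 1], [-3, 6, 0, -1], [1, -7, 10, 2]])
```

For the matrix of Problem 3.17, `reduced` has rows (-3, 6, 0, -1), (0, -5, 10, 5/3), (0, 0, 6, 1).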


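3.18. Reduce each of the following matrices to row canonical form:

\[
(a) \; A = \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 4 & 4 & 1 & 10 & 13 \\ 8 & 8 & -1 & 26 & 23 \end{bmatrix}, \qquad
(b) \; B = \begin{bmatrix} 5 & -9 & 6 \\ 0 & 2 & 3 \\ 0 & 0 & 7 \end{bmatrix}
\]

(a) First reduce A to echelon form by applying the operations "Replace R_2 by -2R_1 + R_2" and "Replace R_3 by
-4R_1 + R_3," and then applying the operation "Replace R_3 by -R_2 + R_3." These operations yield

\[
A \sim \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 3 & 2 & 7 \end{bmatrix}
\sim \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 0 & 4 & 2 \end{bmatrix}
\]

Now use back-substitution on the echelon matrix to obtain the row canonical form of A. Specifically, first
multiply R_3 by 1/4 to obtain the pivot a_34 = 1, and then apply the operations "Replace R_2 by 2R_3 + R_2" and
"Replace R_1 by -6R_3 + R_1." These operations yield

\[
A \sim \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix}
\sim \begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ 0 & 0 & 3 & 0 & 6 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix}
\]

Now multiply R_2 by 1/3, making the pivot a_23 = 1, and then apply "Replace R_1 by R_2 + R_1," yielding

\[
A \sim \begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix}
\sim \begin{bmatrix} 2 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix}
\]

Finally, multiply R_1 by 1/2, so the pivot a_11 = 1. Thus, we obtain the following row canonical form of A:

\[
A \sim \begin{bmatrix} 1 & 1 & 0 & 0 & \tfrac{3}{2} \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix}
\]

(b) Because B is in echelon form, use back-substitution to obtain

\[
B \sim \begin{bmatrix} 5 & -9 & 6 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix}
\sim \begin{bmatrix} 5 & -9 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\sim \begin{bmatrix} 5 & -9 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\sim \begin{bmatrix} 5 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

The last matrix, which is the identity matrix I, is the row canonical form of B. (This is expected, because
B is invertible, and so its row canonical form must be I.)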


3.19. Describe the Gauss-Jordan elimination algorithm, which also row reduces an arbitrary matrix A to
its row canonical form.

The Gauss-Jordan algorithm is similar in some ways to the Gaussian elimination algorithm, except that
here each pivot is used to place 0's both below and above the pivot, not just below the pivot, before working
with the next pivot. Also, one variation of the algorithm first normalizes each row (that is, obtains a unit
pivot) before it is used to produce 0's in the other rows, rather than normalizing the rows at the end of the
algorithm.


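3.20. Let

\[ A = \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 1 & 1 & 4 & -1 & 3 \\ 2 & 5 & 9 & -2 & 8 \end{bmatrix} \]

Use Gauss-Jordan to find the row canonical form of A.

Use a_11 = 1 as a pivot to obtain 0's below a_11 by applying the operations "Replace R_2 by -R_1 + R_2" and
"Replace R_3 by -2R_1 + R_3." This yields

\[ A \sim \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 0 & 3 & 1 & -2 & 1 \\ 0 & 9 & 3 & -4 & 4 \end{bmatrix} \]

Multiply R_2 by 1/3 to make the pivot a_22 = 1, and then produce 0's below and above a_22 by applying the
operations "Replace R_3 by -9R_2 + R_3" and "Replace R_1 by 2R_2 + R_1." These operations yield

\[
A \sim \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 0 & 1 & \tfrac{1}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 9 & 3 & -4 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & \tfrac{11}{3} & -\tfrac{1}{3} & \tfrac{8}{3} \\ 0 & 1 & \tfrac{1}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 0 & 0 & 2 & 1 \end{bmatrix}
\]

Finally, multiply R_3 by 1/2 to make the pivot a_34 = 1, and then produce 0's above a_34 by applying the
operations "Replace R_2 by (2/3)R_3 + R_2" and "Replace R_1 by (1/3)R_3 + R_1." These operations yield

\[
A \sim \begin{bmatrix} 1 & 0 & \tfrac{11}{3} & -\tfrac{1}{3} & \tfrac{8}{3} \\ 0 & 1 & \tfrac{1}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & \tfrac{11}{3} & 0 & \tfrac{17}{6} \\ 0 & 1 & \tfrac{1}{3} & 0 & \tfrac{2}{3} \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix}
\]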

which is the row canonical form of A. 
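A compact sketch of the Gauss-Jordan variant described in Problem 3.19, in which each row is normalized to a unit pivot before the pivot is used to clear its column, is given below. The function name `row_canonical` and the use of exact fractions are assumptions of this illustration.

```python
from fractions import Fraction

def row_canonical(a):
    """Return the row canonical form of a (a list of rows) by Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]          # normalize: unit pivot
        for i in range(rows):
            if i != r:
                k = m[i][c]
                m[i] = [y - k * x for x, y in zip(m[r], m[i])]  # 0's below and above
        r += 1
    return m

canonical = row_canonical([[1, -2, 3, 1, 2], [1, 1, 4, -1, 3], [2, 5, 9, -2, 8]])
```

For the matrix of Problem 3.20, `canonical` has rows (1, 0, 11/3, 0, 17/6), (0, 1, 1/3, 0, 2/3), (0, 0, 0, 1, 1/2).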


Systems of Linear Equations in Matrix Form 

3.21. Find the augmented matrix M and the coefficient matrix A of the following system:

x + 2y - 3z = 4
3y - 4z + 7x = 5
6z + 8x - 9y = 1

First align the unknowns in the system, and then use the aligned system to obtain M and A. We have

 x + 2y - 3z = 4
7x + 3y - 4z = 5
8x - 9y + 6z = 1

then

\[
M = \begin{bmatrix} 1 & 2 & -3 & 4 \\ 7 & 3 & -4 & 5 \\ 8 & -9 & 6 & 1 \end{bmatrix}
\quad \text{and} \quad
A = \begin{bmatrix} 1 & 2 & -3 \\ 7 & 3 & -4 \\ 8 & -9 & 6 \end{bmatrix}
\]

3.22. Solve each of the following systems using its augmented matrix M:

(a)  x + 2y -  z = 3       (b)  x - 2y + 4z = 2       (c)  x +  y + 3z = 1
     x + 3y +  z = 5           2x - 3y + 5z = 3           2x + 3y -  z = 3
    3x + 8y + 4z = 17          3x - 4y + 6z = 7           5x + 7y +  z = 7

(a) Reduce the augmented matrix M to echelon form as follows:

\[
M = \begin{bmatrix} 1 & 2 & -1 & 3 \\ 1 & 3 & 1 & 5 \\ 3 & 8 & 4 & 17 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 2 & 7 & 8 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 3 & 4 \end{bmatrix}
\]

Now write down the corresponding triangular system

x + 2y -  z = 3
     y + 2z = 2
         3z = 4

and solve by back-substitution to obtain the unique solution

x = 17/3, y = -2/3, z = 4/3 or u = (17/3, -2/3, 4/3)

Alternately, reduce the echelon form of M to row canonical form, obtaining

\[
M \sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 1 & \tfrac{4}{3} \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 0 & \tfrac{13}{3} \\ 0 & 1 & 0 & -\tfrac{2}{3} \\ 0 & 0 & 1 & \tfrac{4}{3} \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & 0 & \tfrac{17}{3} \\ 0 & 1 & 0 & -\tfrac{2}{3} \\ 0 & 0 & 1 & \tfrac{4}{3} \end{bmatrix}
\]

This also corresponds to the above solution.

(b) First reduce the augmented matrix M to echelon form as follows:

\[
M = \begin{bmatrix} 1 & -2 & 4 & 2 \\ 2 & -3 & 5 & 3 \\ 3 & -4 & 6 & 7 \end{bmatrix}
\sim \begin{bmatrix} 1 & -2 & 4 & 2 \\ 0 & 1 & -3 & -1 \\ 0 & 2 & -6 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & -2 & 4 & 2 \\ 0 & 1 & -3 & -1 \\ 0 & 0 & 0 & 3 \end{bmatrix}
\]

The third row corresponds to the degenerate equation 0x + 0y + 0z = 3, which has no solution. Thus,
"DO NOT CONTINUE." The original system also has no solution. (Note that the echelon form indicates
whether or not the system has a solution.)

(c) Reduce the augmented matrix M to echelon form and then to row canonical form:

\[
M = \begin{bmatrix} 1 & 1 & 3 & 1 \\ 2 & 3 & -1 & 3 \\ 5 & 7 & 1 & 7 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 3 & 1 \\ 0 & 1 & -7 & 1 \\ 0 & 2 & -14 & 2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & 10 & 0 \\ 0 & 1 & -7 & 1 \end{bmatrix}
\]

(The third row of the second matrix is deleted, because it is a multiple of the second row and will result in
a zero row.) Write down the system corresponding to the row canonical form of M and then transfer the
free variables to the other side to obtain the free-variable form of the solution:

x + 10z = 0                 x = -10z
y -  7z = 1       and       y = 1 + 7z

Here z is the only free variable. The parametric solution, using z = a, is as follows:

x = -10a, y = 1 + 7a, z = a or u = (-10a, 1 + 7a, a)


3.23. Solve the following system using its augmented matrix M:

 x_1 + 2x_2 - 3x_3 - 2x_4 + 4x_5 = 1
2x_1 + 5x_2 - 8x_3 -  x_4 + 6x_5 = 4
 x_1 + 4x_2 - 7x_3 + 5x_4 + 2x_5 = 8

Reduce the augmented matrix M to echelon form and then to row canonical form:

\[
M = \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 2 & 5 & -8 & -1 & 6 & 4 \\ 1 & 4 & -7 & 5 & 2 & 8 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 0 & 1 & -2 & 3 & -2 & 2 \\ 0 & 2 & -4 & 7 & -2 & 7 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 0 & 1 & -2 & 3 & -2 & 2 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix}
\]

\[
\sim \begin{bmatrix} 1 & 2 & -3 & 0 & 8 & 7 \\ 0 & 1 & -2 & 0 & -8 & -7 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & 1 & 0 & 24 & 21 \\ 0 & 1 & -2 & 0 & -8 & -7 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix}
\]

Write down the system corresponding to the row canonical form of M and then transfer the free variables to
the other side to obtain the free-variable form of the solution:

x_1 +  x_3 + 24x_5 = 21                 x_1 = 21 - x_3 - 24x_5
x_2 - 2x_3 -  8x_5 = -7       and       x_2 = -7 + 2x_3 + 8x_5
       x_4 +  2x_5 = 3                  x_4 = 3 - 2x_5

Here x_1, x_2, x_4 are the pivot variables and x_3 and x_5 are the free variables. Recall that the parametric form of
the solution can be obtained from the free-variable form of the solution by simply setting the free variables
equal to parameters, say x_3 = a, x_5 = b. This process yields

x_1 = 21 - a - 24b, x_2 = -7 + 2a + 8b, x_3 = a, x_4 = 3 - 2b, x_5 = b
or u = (21 - a - 24b, -7 + 2a + 8b, a, 3 - 2b, b)

which is another form of the solution.


Linear Combinations, Homogeneous Systems 

3.24. Write v as a linear combination of u_1, u_2, u_3, where

(a) v = (3, 10, 7) and u_1 = (1, 3, -2), u_2 = (1, 4, 2), u_3 = (2, 8, 1);

(b) v = (2, 7, 10) and u_1 = (1, 2, 3), u_2 = (1, 3, 5), u_3 = (1, 5, 9);

(c) v = (1, 5, 4) and u_1 = (1, 3, -2), u_2 = (2, 7, -1), u_3 = (1, 6, 7).

Find the equivalent system of linear equations by writing v = xu_1 + yu_2 + zu_3. Alternatively, use the
augmented matrix M of the equivalent system, where M = [u_1, u_2, u_3, v]. (Here u_1, u_2, u_3, v are the columns
of M.)

(a) The vector equation v = xu_1 + yu_2 + zu_3 for the given vectors is as follows:

\[
\begin{bmatrix} 3 \\ 10 \\ 7 \end{bmatrix}
= x \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}
+ y \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}
+ z \begin{bmatrix} 2 \\ 8 \\ 1 \end{bmatrix}
= \begin{bmatrix} x + y + 2z \\ 3x + 4y + 8z \\ -2x + 2y + z \end{bmatrix}
\]

Form the equivalent system of linear equations by setting corresponding entries equal to each other, and
then reduce the system to echelon form:

  x +  y + 2z = 3             x + y + 2z = 3             x + y + 2z = 3
 3x + 4y + 8z = 10     or         y + 2z = 1      or         y + 2z = 1
-2x + 2y +  z = 7                4y + 5z = 13                   -3z = 9

The system is in triangular form. Back-substitution yields the unique solution x = 2, y = 7, z = -3.
Thus, v = 2u_1 + 7u_2 - 3u_3.

Alternatively, form the augmented matrix M = [u_1, u_2, u_3, v] of the equivalent system, and reduce
M to echelon form:

\[
M = \begin{bmatrix} 1 & 1 & 2 & 3 \\ 3 & 4 & 8 & 10 \\ -2 & 2 & 1 & 7 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 4 & 5 & 13 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & -3 & 9 \end{bmatrix}
\]

The last matrix corresponds to a triangular system that has a unique solution. Back-substitution yields the
solution x = 2, y = 7, z = -3. Thus, v = 2u_1 + 7u_2 - 3u_3.

(b) Form the augmented matrix $M = [u_1, u_2, u_3, v]$ of the equivalent system, and reduce $M$ to echelon form:

\[
M = \begin{bmatrix} 1 & 1 & 1 & 2 \\ 2 & 3 & 5 & 7 \\ 3 & 5 & 9 & 10 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 3 & 3 \\ 0 & 2 & 6 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 3 & 3 \\ 0 & 0 & 0 & -2 \end{bmatrix}
\]

The third row corresponds to the degenerate equation $0x + 0y + 0z = -2$, which has no solution. Thus,
the system also has no solution, and $v$ cannot be written as a linear combination of $u_1, u_2, u_3$.

(c) Form the augmented matrix $M = [u_1, u_2, u_3, v]$ of the equivalent system, and reduce $M$ to echelon form:

\[
M = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 3 & 7 & 6 & 5 \\ -2 & -1 & 7 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 3 & 2 \\ 0 & 3 & 9 & 6 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 3 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]

The last matrix corresponds to the following system with free variable $z$:

\[
\begin{aligned} x + 2y + z &= 1 \\ y + 3z &= 2 \end{aligned}
\]

Thus, $v$ can be written as a linear combination of $u_1, u_2, u_3$ in many ways. For example, let the free
variable $z = 1$, and, by back-substitution, we get $y = -1$ and $x = 2$. Thus, $v = 2u_1 - u_2 + u_3$.
(Check: $2(1, 3, -2) - (2, 7, -1) + (1, 6, 7) = (1, 5, 4)$.)
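
The three outcomes in Problem 3.24 (unique solution, no solution, many solutions) can be read off the row canonical form of $M = [u_1, u_2, u_3, v]$. The sketch below is illustrative only: `rref` and `classify` are hypothetical helper names, and exact arithmetic with `Fraction` is a design choice made here to avoid rounding.

```python
from fractions import Fraction

def rref(rows):
    """Return the row canonical form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # Look for a nonzero entry in this column at or below pivot_row.
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        p = m[pivot_row][col]
        m[pivot_row] = [x / p for x in m[pivot_row]]
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m

def classify(us, v):
    """Classify the ways of writing v as a combination of the vectors in us."""
    # The columns of the augmented matrix are u1, u2, ..., v.
    M = [list(row) for row in zip(*us, v)]
    R = rref(M)
    n = len(us)
    pivots = [next((j for j, x in enumerate(row) if x != 0), None) for row in R]
    if n in pivots:                      # pivot in the last column: 0 = nonzero
        return "none"
    rank = sum(p is not None for p in pivots)
    return "unique" if rank == n else "many"

print(classify([(1, 3, -2), (1, 4, 2), (2, 8, 1)], (3, 10, 7)))   # part (a)
print(classify([(1, 2, 3), (1, 3, 5), (1, 5, 9)], (2, 7, 10)))    # part (b)
print(classify([(1, 3, -2), (2, 7, -1), (1, 6, 7)], (1, 5, 4)))   # part (c)
```

When the classification is "unique", the last column of the row canonical form holds the coefficients x, y, z directly.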


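3.25. Let $u_1 = (1, 2, 4)$, $u_2 = (2, -3, 1)$, $u_3 = (2, 1, -1)$ in $\mathbf{R}^3$. Show that $u_1, u_2, u_3$ are orthogonal, and
write $v$ as a linear combination of $u_1, u_2, u_3$, where (a) $v = (7, 16, 6)$, (b) $v = (3, 5, 2)$.

Take the dot product of pairs of vectors to get

\[ u_1 \cdot u_2 = 2 - 6 + 4 = 0, \qquad u_1 \cdot u_3 = 2 + 2 - 4 = 0, \qquad u_2 \cdot u_3 = 4 - 3 - 1 = 0 \]

Thus, the three vectors in $\mathbf{R}^3$ are orthogonal, and hence Fourier coefficients can be used. That is,
$v = xu_1 + yu_2 + zu_3$, where

\[ x = \frac{v \cdot u_1}{u_1 \cdot u_1}, \qquad y = \frac{v \cdot u_2}{u_2 \cdot u_2}, \qquad z = \frac{v \cdot u_3}{u_3 \cdot u_3} \]

(a) We have

\[ x = \frac{7 + 32 + 24}{1 + 4 + 16} = \frac{63}{21} = 3, \qquad y = \frac{14 - 48 + 6}{4 + 9 + 1} = \frac{-28}{14} = -2, \qquad z = \frac{14 + 16 - 6}{4 + 1 + 1} = \frac{24}{6} = 4 \]

Thus, $v = 3u_1 - 2u_2 + 4u_3$.

(b) We have

\[ x = \frac{3 + 10 + 8}{1 + 4 + 16} = \frac{21}{21} = 1, \qquad y = \frac{6 - 15 + 2}{4 + 9 + 1} = \frac{-7}{14} = -\frac{1}{2}, \qquad z = \frac{6 + 5 - 2}{4 + 1 + 1} = \frac{9}{6} = \frac{3}{2} \]

Thus, $v = u_1 - \frac{1}{2}u_2 + \frac{3}{2}u_3$.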
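
The Fourier-coefficient computation in Problem 3.25 is short enough to sketch in code. This is an illustration under the assumption that the given vectors are nonzero and mutually orthogonal; the helper names `dot` and `fourier_coefficients` are not from the text.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(v, basis):
    """Coefficients (v . u_i) / (u_i . u_i) for mutually orthogonal nonzero u_i."""
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

basis = [(1, 2, 4), (2, -3, 1), (2, 1, -1)]

# Orthogonality check: every pair of distinct vectors has dot product 0.
print(all(dot(basis[i], basis[j]) == 0 for i in range(3) for j in range(i + 1, 3)))

print(fourier_coefficients((7, 16, 6), basis))   # part (a)
print(fourier_coefficients((3, 5, 2), basis))    # part (b)
```

No system of equations has to be solved here; orthogonality reduces each coefficient to a single quotient of dot products.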


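3.26. Find the dimension and a basis for the general solution $W$ of each of the following homogeneous
systems:

\[
\text{(a)}\quad \begin{aligned} 2x_1 + 4x_2 - 5x_3 + 3x_4 &= 0 \\ 3x_1 + 6x_2 - 7x_3 + 4x_4 &= 0 \\ 5x_1 + 10x_2 - 11x_3 + 6x_4 &= 0 \end{aligned}
\qquad\qquad
\text{(b)}\quad \begin{aligned} x - 2y - 3z &= 0 \\ 2x + y + 3z &= 0 \\ 3x - 4y - 2z &= 0 \end{aligned}
\]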

(a) Reduce the system to echelon form using the operations “Replace $L_2$ by $-3L_1 + 2L_2$,” “Replace $L_3$ by
$-5L_1 + 2L_3$,” and then “Replace $L_3$ by $-3L_2 + L_3$.” These operations yield

\[
\begin{aligned} 2x_1 + 4x_2 - 5x_3 + 3x_4 &= 0 \\ x_3 - x_4 &= 0 \\ 3x_3 - 3x_4 &= 0 \end{aligned}
\qquad\text{and}\qquad
\begin{aligned} 2x_1 + 4x_2 - 5x_3 + 3x_4 &= 0 \\ x_3 - x_4 &= 0 \end{aligned}
\]

The system in echelon form has two free variables, $x_2$ and $x_4$, so $\dim W = 2$. A basis $[u_1, u_2]$ for $W$ may
be obtained as follows:

(1) Set $x_2 = 1$, $x_4 = 0$. Back-substitution yields $x_3 = 0$, and then $x_1 = -2$. Thus, $u_1 = (-2, 1, 0, 0)$.

(2) Set $x_2 = 0$, $x_4 = 1$. Back-substitution yields $x_3 = 1$, and then $x_1 = 1$. Thus, $u_2 = (1, 0, 1, 1)$.

(b) Reduce the system to echelon form, obtaining

\[
\begin{aligned} x - 2y - 3z &= 0 \\ 5y + 9z &= 0 \\ 2y + 7z &= 0 \end{aligned}
\qquad\text{and}\qquad
\begin{aligned} x - 2y - 3z &= 0 \\ 5y + 9z &= 0 \\ 17z &= 0 \end{aligned}
\]

There are no free variables (the system is in triangular form). Hence, $\dim W = 0$, and $W$ has no basis.
Specifically, $W$ consists only of the zero solution; that is, $W = \{0\}$.


3.27. Find the dimension and a basis for the general solution W of the following homogeneous system 
using matrix notation: 

\[
\begin{aligned}
x_1 + 2x_2 + 3x_3 - 2x_4 + 4x_5 &= 0 \\
2x_1 + 4x_2 + 8x_3 + x_4 + 9x_5 &= 0 \\
3x_1 + 6x_2 + 13x_3 + 4x_4 + 14x_5 &= 0
\end{aligned}
\]

Show how the basis gives the parametric form of the general solution of the system.

When a system is homogeneous, we represent the system by its coefficient matrix $A$ rather than by its
augmented matrix $M$, because the last column of the augmented matrix $M$ is a zero column, and it will remain
a zero column during any row-reduction process.

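Reduce the coefficient matrix $A$ to echelon form, obtaining

\[
A = \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 2 & 4 & 8 & 1 & 9 \\ 3 & 6 & 13 & 4 & 14 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 0 & 0 & 2 & 5 & 1 \\ 0 & 0 & 4 & 10 & 2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 0 & 0 & 2 & 5 & 1 \end{bmatrix}
\]

(The third row of the second matrix is deleted, because it is a multiple of the second row and will result in a
zero row.) We can now proceed in one of two ways.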


(a) Write down the corresponding homogeneous system in echelon form:

\[
\begin{aligned}
x_1 + 2x_2 + 3x_3 - 2x_4 + 4x_5 &= 0 \\
2x_3 + 5x_4 + x_5 &= 0
\end{aligned}
\]

The system in echelon form has three free variables, $x_2, x_4, x_5$, so $\dim W = 3$. A basis $[u_1, u_2, u_3]$ for $W$
may be obtained as follows:

(1) Set $x_2 = 1$, $x_4 = 0$, $x_5 = 0$. Back-substitution yields $x_3 = 0$, and then $x_1 = -2$. Thus,
$u_1 = (-2, 1, 0, 0, 0)$.

(2) Set $x_2 = 0$, $x_4 = 1$, $x_5 = 0$. Back-substitution yields $x_3 = -\frac{5}{2}$, and then $x_1 = \frac{19}{2}$. Thus,
$u_2 = (\frac{19}{2}, 0, -\frac{5}{2}, 1, 0)$.

(3) Set $x_2 = 0$, $x_4 = 0$, $x_5 = 1$. Back-substitution yields $x_3 = -\frac{1}{2}$, and then $x_1 = -\frac{5}{2}$. Thus,
$u_3 = (-\frac{5}{2}, 0, -\frac{1}{2}, 0, 1)$.

[One could avoid fractions in the basis by choosing $x_4 = 2$ in (2) and $x_5 = 2$ in (3), which yields
multiples of $u_2$ and $u_3$.] The parametric form of the general solution is obtained from the following linear
combination of the basis vectors using parameters $a, b, c$:

\[ au_1 + bu_2 + cu_3 = \left(-2a + \tfrac{19}{2}b - \tfrac{5}{2}c,\ a,\ -\tfrac{5}{2}b - \tfrac{1}{2}c,\ b,\ c\right) \]

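(b) Reduce the echelon form of $A$ to row canonical form:

\[
A \sim \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 0 & 0 & 1 & \frac{5}{2} & \frac{1}{2} \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 0 & -\frac{19}{2} & \frac{5}{2} \\ 0 & 0 & 1 & \frac{5}{2} & \frac{1}{2} \end{bmatrix}
\]

Write down the corresponding free-variable solution:

\[
\begin{aligned}
x_1 &= -2x_2 + \tfrac{19}{2}x_4 - \tfrac{5}{2}x_5 \\
x_3 &= -\tfrac{5}{2}x_4 - \tfrac{1}{2}x_5
\end{aligned}
\]

Using these equations for the pivot variables $x_1$ and $x_3$, repeat the above process to obtain a basis $[u_1, u_2, u_3]$
for $W$. That is, set $x_2 = 1$, $x_4 = 0$, $x_5 = 0$ to get $u_1$; set $x_2 = 0$, $x_4 = 1$, $x_5 = 0$ to get $u_2$; and set $x_2 = 0$,
$x_4 = 0$, $x_5 = 1$ to get $u_3$.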
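
Method (b) of Problem 3.27 translates directly into a short procedure: reduce $A$ to row canonical form, then set one free variable to 1 and the others to 0 in turn. The following is a sketch under that reading; `rref` and `null_space_basis` are illustrative names, and `Fraction` is used so the fractional entries come out exactly.

```python
from fractions import Fraction

def rref(rows):
    """Return the row canonical form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        p = m[pivot_row][col]
        m[pivot_row] = [x / p for x in m[pivot_row]]
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m

def null_space_basis(A):
    """Basis of the solution space W of AX = 0, one vector per free variable."""
    R = rref(A)
    n = len(A[0])
    # In row canonical form the nonzero rows come first, in pivot order.
    pivot_cols = [next(j for j, x in enumerate(row) if x != 0)
                  for row in R if any(x != 0 for x in row)]
    free_cols = [j for j in range(n) if j not in pivot_cols]
    basis = []
    for f in free_cols:
        vec = [Fraction(0)] * n
        vec[f] = Fraction(1)                 # this free variable is set to 1
        for i, p in enumerate(pivot_cols):
            vec[p] = -R[i][f]                # pivot variable from row i
        basis.append(vec)
    return basis

A = [[1, 2, 3, -2, 4], [2, 4, 8, 1, 9], [3, 6, 13, 4, 14]]
for u in null_space_basis(A):
    print([str(x) for x in u])
```

The number of vectors returned is the number of free variables, which is dim W; an empty list corresponds to W = {0}.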


3.28. Prove Theorem 3.15. Let $v_0$ be a particular solution of $AX = B$, and let $W$ be the general solution of
$AX = 0$. Then $U = v_0 + W = \{v_0 + w : w \in W\}$ is the general solution of $AX = B$.

Let $w$ be a solution of $AX = 0$. Then

\[ A(v_0 + w) = Av_0 + Aw = B + 0 = B \]

Thus, the sum $v_0 + w$ is a solution of $AX = B$. On the other hand, suppose $v$ is also a solution of $AX = B$.
Then

\[ A(v - v_0) = Av - Av_0 = B - B = 0 \]

Therefore, $v - v_0$ belongs to $W$. Because $v = v_0 + (v - v_0)$, we find that any solution of $AX = B$ can be
obtained by adding a solution of $AX = 0$ to the particular solution $v_0$ of $AX = B$. Thus, the theorem is proved.


Elementary Matrices, Applications

3.29. Let $e_1, e_2, e_3$ denote, respectively, the elementary row operations

“Interchange rows $R_1$ and $R_2$,” “Replace $R_3$ by $7R_3$,” “Replace $R_2$ by $-3R_1 + R_2$”

Find the corresponding three-square elementary matrices $E_1, E_2, E_3$. Apply each operation to the $3 \times 3$ identity
matrix $I_3$ to obtain

\[
E_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 7 \end{bmatrix}, \qquad
E_3 = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

3.30. Consider the elementary row operations in Problem 3.29.

(a) Describe the inverse operations $e_1^{-1}, e_2^{-1}, e_3^{-1}$.

(b) Find the corresponding three-square elementary matrices $E_1', E_2', E_3'$.

(c) What is the relationship between the matrices $E_1', E_2', E_3'$ and the matrices $E_1, E_2, E_3$?


(a) The inverses of $e_1, e_2, e_3$ are, respectively,

“Interchange rows $R_1$ and $R_2$,” “Replace $R_3$ by $\frac{1}{7}R_3$,” “Replace $R_2$ by $3R_1 + R_2$.”


(b) Apply each inverse operation to the $3 \times 3$ identity matrix $I_3$ to obtain

\[
E_1' = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
E_2' = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{7} \end{bmatrix}, \qquad
E_3' = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]


(c) The matrices $E_1', E_2', E_3'$ are, respectively, the inverses of the matrices $E_1, E_2, E_3$.

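3.31. Write each of the following matrices as a product of elementary matrices:

\[
\text{(a)}\ A = \begin{bmatrix} 1 & -3 \\ -2 & 4 \end{bmatrix}, \qquad
\text{(b)}\ B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
\text{(c)}\ C = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 8 \\ -3 & -1 & 2 \end{bmatrix}
\]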

The following three steps write a matrix M as a product of elementary matrices: 


Step 1. Row reduce M to the identity matrix /, keeping track of the elementary row operations. 

Step 2. Write down the inverse row operations. 

Step 3. Write M as the product of the elementary matrices corresponding to the inverse operations. This 
gives the desired result. 

If a zero row appears in Step 1, then M is not row equivalent to the identity matrix /, and M cannot be written 
as a product of elementary matrices. 


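(a) (1) We have

\[
A = \begin{bmatrix} 1 & -3 \\ -2 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & -3 \\ 0 & -2 \end{bmatrix}
\sim \begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I
\]

where the row operations are, respectively,

“Replace $R_2$ by $2R_1 + R_2$,” “Replace $R_2$ by $-\frac{1}{2}R_2$,” “Replace $R_1$ by $3R_2 + R_1$”

(2) Inverse operations:

“Replace $R_2$ by $-2R_1 + R_2$,” “Replace $R_2$ by $-2R_2$,” “Replace $R_1$ by $-3R_2 + R_1$”

\[
\text{(3)}\quad A = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}
\begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix}
\]

(b) (1) We have

\[
B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I
\]

where the row operations are, respectively,

“Replace $R_2$ by $-4R_3 + R_2$,” “Replace $R_1$ by $-3R_3 + R_1$,” “Replace $R_1$ by $-2R_2 + R_1$”

(2) Inverse operations:

“Replace $R_2$ by $4R_3 + R_2$,” “Replace $R_1$ by $3R_3 + R_1$,” “Replace $R_1$ by $2R_2 + R_1$”

\[
\text{(3)}\quad B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

(c) (1) First row reduce $C$ to echelon form. We have

\[
C = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 8 \\ -3 & -1 & 2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 4 \\ 0 & 2 & 8 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 4 \\ 0 & 0 & 0 \end{bmatrix}
\]

In echelon form, $C$ has a zero row. “STOP.” The matrix $C$ cannot be row reduced to the identity
matrix $I$, and $C$ cannot be written as a product of elementary matrices. (We note, in particular, that
$C$ has no inverse.)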


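3.32. Find the inverse of

\[
\text{(a)}\ A = \begin{bmatrix} 1 & 2 & -4 \\ -1 & -1 & 5 \\ 2 & 7 & -3 \end{bmatrix}, \qquad
\text{(b)}\ B = \begin{bmatrix} 1 & 3 & -4 \\ 1 & 5 & -1 \\ 3 & 13 & -6 \end{bmatrix}
\]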

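(a) Form the matrix $M = [A, I]$ and row reduce $M$ to echelon form:

\[
M = \left[\begin{array}{rrr|rrr} 1 & 2 & -4 & 1 & 0 & 0 \\ -1 & -1 & 5 & 0 & 1 & 0 \\ 2 & 7 & -3 & 0 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{rrr|rrr} 1 & 2 & -4 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 3 & 5 & -2 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{rrr|rrr} 1 & 2 & -4 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 2 & -5 & -3 & 1 \end{array}\right]
\]

In echelon form, the left half of $M$ is in triangular form; hence, $A$ has an inverse. Further reduce $M$ to row
canonical form:

\[
M \sim \left[\begin{array}{rrr|rrr} 1 & 2 & 0 & -9 & -6 & 2 \\ 0 & 1 & 0 & \frac{7}{2} & \frac{5}{2} & -\frac{1}{2} \\ 0 & 0 & 1 & -\frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{array}\right]
\sim \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & -16 & -11 & 3 \\ 0 & 1 & 0 & \frac{7}{2} & \frac{5}{2} & -\frac{1}{2} \\ 0 & 0 & 1 & -\frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{array}\right]
\]

The final matrix has the form $[I, A^{-1}]$; that is, $A^{-1}$ is the right half of the last matrix. Thus,

\[
A^{-1} = \begin{bmatrix} -16 & -11 & 3 \\ \frac{7}{2} & \frac{5}{2} & -\frac{1}{2} \\ -\frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{bmatrix}
\]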


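(b) Form the matrix $M = [B, I]$ and row reduce $M$ to echelon form:

\[
M = \left[\begin{array}{rrr|rrr} 1 & 3 & -4 & 1 & 0 & 0 \\ 1 & 5 & -1 & 0 & 1 & 0 \\ 3 & 13 & -6 & 0 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{rrr|rrr} 1 & 3 & -4 & 1 & 0 & 0 \\ 0 & 2 & 3 & -1 & 1 & 0 \\ 0 & 4 & 6 & -3 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{rrr|rrr} 1 & 3 & -4 & 1 & 0 & 0 \\ 0 & 2 & 3 & -1 & 1 & 0 \\ 0 & 0 & 0 & -1 & -2 & 1 \end{array}\right]
\]

In echelon form, $M$ has a zero row in its left half; that is, $B$ is not row reducible to triangular form.
Accordingly, $B$ has no inverse.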
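
The procedure of Problem 3.32 (reduce $[A, I]$ until the left half is $I$, or stop when a pivot is missing) can be sketched as follows. The function name `inverse` and the choice to return `None` for a non-invertible matrix are assumptions made for this illustration.

```python
from fractions import Fraction

def inverse(A):
    """Row reduce [A, I] to [I, A^-1]; return A^-1, or None if A has no inverse."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a pivot for this column at or below the diagonal.
        pr = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pr is None:
            return None                      # zero row in the left half
        M[col], M[pr] = M[pr], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]     # make the pivot equal to 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]            # right half is the inverse

A = [[1, 2, -4], [-1, -1, 5], [2, 7, -3]]
B = [[1, 3, -4], [1, 5, -1], [3, 13, -6]]
print(inverse(A))
print(inverse(B))
```

Unlike the hand computation, this sketch clears entries above and below each pivot in one pass, so it goes straight to row canonical form rather than stopping at echelon form first.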
3.33. Show that every elementary matrix $E$ is invertible, and its inverse is an elementary matrix.

Let $E$ be the elementary matrix corresponding to the elementary operation $e$; that is, $e(I) = E$. Let $e'$ be the
inverse operation of $e$ and let $E'$ be the corresponding elementary matrix; that is, $e'(I) = E'$. Then

\[ I = e'(e(I)) = e'(E) = E'E \qquad\text{and}\qquad I = e(e'(I)) = e(E') = EE' \]

Therefore, $E'$ is the inverse of $E$.


3.34. Prove Theorem 3.16: Let $e$ be an elementary row operation and let $E$ be the corresponding $m$-square
elementary matrix; that is, $E = e(I)$. Then $e(A) = EA$, where $A$ is any $m \times n$ matrix.

Let $R_i$ be the row $i$ of $A$; we denote this by writing $A = [R_1, \ldots, R_m]$. If $B$ is a matrix for which $AB$ is
defined, then $AB = [R_1B, \ldots, R_mB]$. We also let

\[ e_i = (0, \ldots, 0, 1, 0, \ldots, 0), \quad\text{where the 1 is the } i\text{th entry.} \]

One can show (Problem 2.45) that $e_iA = R_i$. We also note that
$I = [e_1, e_2, \ldots, e_m]$ is the identity matrix.

(i) Let $e$ be the elementary row operation “Interchange rows $R_i$ and $R_j$.” Then, with $e_j$ in position $i$ and $e_i$ in position $j$,

\[ E = e(I) = [e_1, \ldots, e_j, \ldots, e_i, \ldots, e_m] \]
and
\[ e(A) = [R_1, \ldots, R_j, \ldots, R_i, \ldots, R_m] \]

Thus,

\[ EA = [e_1A, \ldots, e_jA, \ldots, e_iA, \ldots, e_mA] = [R_1, \ldots, R_j, \ldots, R_i, \ldots, R_m] = e(A) \]

(ii) Let $e$ be the elementary row operation “Replace $R_i$ by $kR_i$ ($k \neq 0$).” Then, with $ke_i$ in position $i$,

\[ E = e(I) = [e_1, \ldots, ke_i, \ldots, e_m] \]
and
\[ e(A) = [R_1, \ldots, kR_i, \ldots, R_m] \]

Thus,

\[ EA = [e_1A, \ldots, ke_iA, \ldots, e_mA] = [R_1, \ldots, kR_i, \ldots, R_m] = e(A) \]

(iii) Let $e$ be the elementary row operation “Replace $R_i$ by $kR_j + R_i$.” Then, with $ke_j + e_i$ in position $i$,

\[ E = e(I) = [e_1, \ldots, ke_j + e_i, \ldots, e_m] \]
and
\[ e(A) = [R_1, \ldots, kR_j + R_i, \ldots, R_m] \]

Using $(ke_j + e_i)A = k(e_jA) + e_iA = kR_j + R_i$, we have

\[ EA = [e_1A, \ldots, (ke_j + e_i)A, \ldots, e_mA] = [R_1, \ldots, kR_j + R_i, \ldots, R_m] = e(A) \]
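
Theorem 3.16 can be spot-checked on a small example. The sketch below is an illustration, not a proof: the three operation builders (`interchange`, `scale`, `add_multiple`) are hypothetical names, rows are indexed from 0 rather than 1, and the matrix `A` is an arbitrary 3 x 4 example.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Product XY of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def interchange(i, j):
    """Row operation: interchange rows i and j (0-based)."""
    def op(M):
        M = [row[:] for row in M]
        M[i], M[j] = M[j], M[i]
        return M
    return op

def scale(i, k):
    """Row operation: replace row i by k times row i (k nonzero)."""
    def op(M):
        M = [row[:] for row in M]
        M[i] = [k * x for x in M[i]]
        return M
    return op

def add_multiple(i, j, k):
    """Row operation: replace row i by k times row j plus row i."""
    def op(M):
        M = [row[:] for row in M]
        M[i] = [k * y + x for x, y in zip(M[i], M[j])]
        return M
    return op

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
# The three operations of Problem 3.29, written with 0-based row indices.
operations = (interchange(0, 1), scale(2, 7), add_multiple(1, 0, -3))
for e in operations:
    E = e(identity(3))                  # E = e(I)
    print(E, matmul(E, A) == e(A))      # compare EA with e(A)
```

Each operation is applied once to the identity to get E and once to A directly; the theorem says the two routes agree.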


3.35. Prove Theorem 3.17: Let A be a square matrix. Then the following are equivalent: 

(a) A is invertible (nonsingular). 

(b) A is row equivalent to the identity matrix I. 

(c) A is a product of elementary matrices. 

Suppose $A$ is invertible and suppose $A$ is row equivalent to matrix $B$ in row canonical form. Then there
exist elementary matrices $E_1, E_2, \ldots, E_s$ such that $E_s \cdots E_2E_1A = B$. Because $A$ is invertible and each
elementary matrix is invertible, $B$ is also invertible. But if $B \neq I$, then $B$ has a zero row; whence $B$ is not
invertible. Thus, $B = I$, and (a) implies (b).

If (b) holds, then there exist elementary matrices $E_1, E_2, \ldots, E_s$ such that $E_s \cdots E_2E_1A = I$. Hence,
$A = (E_s \cdots E_2E_1)^{-1} = E_1^{-1}E_2^{-1} \cdots E_s^{-1}$. But the $E_i^{-1}$ are also elementary matrices. Thus (b) implies (c).

If (c) holds, then $A = E_1E_2 \cdots E_s$. The $E_i$ are invertible matrices; hence, their product $A$ is also invertible.
Thus, (c) implies (a). Accordingly, the theorem is proved.


3.36. Prove Theorem 3.18: If $AB = I$, then $BA = I$, and hence $B = A^{-1}$.

Suppose $A$ is not invertible. Then $A$ is not row equivalent to the identity matrix $I$, and so $A$ is row
equivalent to a matrix with a zero row. In other words, there exist elementary matrices $E_1, \ldots, E_s$ such
that $E_s \cdots E_2E_1A$ has a zero row. Hence, $E_s \cdots E_2E_1AB = E_s \cdots E_2E_1$, an invertible matrix, also has a
zero row. But invertible matrices cannot have zero rows; hence $A$ is invertible, with inverse $A^{-1}$. Then
also,

\[ B = IB = (A^{-1}A)B = A^{-1}(AB) = A^{-1}I = A^{-1} \]

3.37. Prove Theorem 3.19: B is row equivalent to A (written B ~ A) if and only if there exists a 
nonsingular matrix P such that B = PA. 

If $B \sim A$, then $B = e_s(\ldots(e_2(e_1(A)))\ldots) = E_s \cdots E_2E_1A = PA$ where $P = E_s \cdots E_2E_1$ is nonsingular.
Conversely, suppose B = PA, where P is nonsingular. By Theorem 3.17, P is a product of elementary 
matrices, and so B can be obtained from A by a sequence of elementary row operations; that is, B ~ A. Thus, 
the theorem is proved. 


3.38. Prove Theorem 3.21: Every $m \times n$ matrix $A$ is equivalent to a unique block matrix of the form
$\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $I_r$ is the $r \times r$ identity matrix.

The proof is constructive, in the form of an algorithm.

Step 1. Row reduce $A$ to row canonical form, with leading nonzero entries $a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r}$.

Step 2. Interchange $C_1$ and $C_{j_1}$, interchange $C_2$ and $C_{j_2}$, \ldots, and interchange $C_r$ and $C_{j_r}$. This gives a
matrix in the form $\begin{bmatrix} I_r & B \\ 0 & 0 \end{bmatrix}$, with leading nonzero entries $a_{11}, a_{22}, \ldots, a_{rr}$.

Step 3. Use column operations, with the $a_{ii}$ as pivots, to replace each entry in $B$ with a zero; that is, for
$i = 1, 2, \ldots, r$ and $j = r + 1, r + 2, \ldots, n$, apply the operation $-b_{ij}C_i + C_j \to C_j$.

The final matrix has the desired form $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$.


LU Factorization

3.39. Find the LU factorization of

\[
\text{(a)}\ A = \begin{bmatrix} 1 & -3 & 5 \\ 2 & -4 & 7 \\ -1 & -2 & 1 \end{bmatrix}, \qquad
\text{(b)}\ B = \begin{bmatrix} 1 & 4 & -3 \\ 2 & 8 & 1 \\ -5 & -9 & 7 \end{bmatrix}
\]

(a) Reduce $A$ to triangular form by the following operations:

“Replace $R_2$ by $-2R_1 + R_2$,” “Replace $R_3$ by $R_1 + R_3$,” and then
“Replace $R_3$ by $\frac{5}{2}R_2 + R_3$”

These operations yield the following, where the triangular form is $U$:

\[
A \sim \begin{bmatrix} 1 & -3 & 5 \\ 0 & 2 & -3 \\ 0 & -5 & 6 \end{bmatrix}
\sim \begin{bmatrix} 1 & -3 & 5 \\ 0 & 2 & -3 \\ 0 & 0 & -\frac{3}{2} \end{bmatrix} = U
\qquad\text{and}\qquad
L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -\frac{5}{2} & 1 \end{bmatrix}
\]

The entries $2, -1, -\frac{5}{2}$ in $L$ are the negatives of the multipliers $-2, 1, \frac{5}{2}$ in the above row operations. (As a
check, multiply $L$ and $U$ to verify $A = LU$.)

(b) Reduce $B$ to triangular form by first applying the operations “Replace $R_2$ by $-2R_1 + R_2$” and “Replace $R_3$
by $5R_1 + R_3$.” These operations yield

\[
B \sim \begin{bmatrix} 1 & 4 & -3 \\ 0 & 0 & 7 \\ 0 & 11 & -8 \end{bmatrix}
\]

Observe that the second diagonal entry is 0. Thus, $B$ cannot be brought into triangular form without row
interchange operations. Accordingly, $B$ is not LU-factorable. (There does exist a PLU factorization of
such a matrix $B$, where $P$ is a permutation matrix, but such a factorization lies beyond the scope of this
text.)


3.40. Find the LDU factorization of the matrix $A$ in Problem 3.39.


The $A = LDU$ factorization refers to the situation where $L$ is a lower triangular matrix with 1’s on the
diagonal (as in the LU factorization of $A$), $D$ is a diagonal matrix, and $U$ is an upper triangular matrix with 1’s on
the diagonal. Thus, simply factor out the diagonal entries in the matrix $U$ in the above LU factorization of $A$ to
obtain $D$ and the new $U$; the matrix $L$ is unchanged. That is,

\[
L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -\frac{5}{2} & 1 \end{bmatrix}, \qquad
D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -\frac{3}{2} \end{bmatrix}, \qquad
U = \begin{bmatrix} 1 & -3 & 5 \\ 0 & 1 & -\frac{3}{2} \\ 0 & 0 & 1 \end{bmatrix}
\]


3.41. Find the LU factorization of the matrix

\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ -3 & -10 & 2 \end{bmatrix}
\]

Reduce $A$ to triangular form by the following operations:

(1) “Replace $R_2$ by $-2R_1 + R_2$,” (2) “Replace $R_3$ by $3R_1 + R_3$,” (3) “Replace $R_3$ by $-4R_2 + R_3$”

These operations yield the following, where the triangular form is $U$:

\[
A \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & -4 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = U
\qquad\text{and}\qquad
L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -3 & 4 & 1 \end{bmatrix}
\]

The entries $2, -3, 4$ in $L$ are the negatives of the multipliers $-2, 3, -4$ in the above row operations. (As a
check, multiply $L$ and $U$ to verify $A = LU$.)
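
The bookkeeping in Problems 3.39 and 3.41 (record the negative of each multiplier in $L$ while reducing to $U$) can be sketched in a few lines. This is an illustrative version with no row interchanges; as a simplification it returns `None` as soon as any zero pivot is met, and the names `lu` and `matmul` are assumptions of this sketch.

```python
from fractions import Fraction

def lu(A):
    """Return (L, U) with A = LU and unit diagonal in L, or None on a zero pivot."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        if U[j][j] == 0:
            return None                  # a row interchange would be needed
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]        # the operation is "R_i <- -m R_j + R_i"
            L[i][j] = m                  # so L stores the negative of the multiplier
            U[i] = [a - m * b for a, b in zip(U[i], U[j])]
    return L, U

def matmul(X, Y):
    """Product XY of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2, 1], [2, 3, 3], [-3, -10, 2]]      # Problem 3.41
L, U = lu(A)
print(L)
print(U)
print(matmul(L, U) == A)                       # the suggested check A = LU

B = [[1, 4, -3], [2, 8, 1], [-5, -9, 7]]       # Problem 3.39(b)
print(lu(B))
```

Factoring the diagonal out of U afterwards, as in Problem 3.40, would turn this into an LDU factorization.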


3.42. Let $A$ be the matrix in Problem 3.41. Find $X_1, X_2, X_3$, where $X_i$ is the solution of $AX = B_i$ for

(a) $B_1 = (1, 1, 1)$, (b) $B_2 = B_1 + X_1$, (c) $B_3 = B_2 + X_2$.

(a) Find $L^{-1}B_1$ by applying the row operations (1), (2), and then (3) in Problem 3.41 to $B_1$:

\[
B_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
\xrightarrow{(1)\text{ and }(2)} \begin{bmatrix} 1 \\ -1 \\ 4 \end{bmatrix}
\xrightarrow{(3)} \begin{bmatrix} 1 \\ -1 \\ 8 \end{bmatrix}
\]

Solve $UX = B$ for $B = (1, -1, 8)$ by back-substitution to obtain $X_1 = (-25, 9, 8)$.

(b) First find $B_2 = B_1 + X_1 = (1, 1, 1) + (-25, 9, 8) = (-24, 10, 9)$. Then, as above,

\[
B_2 = [-24, 10, 9]^T \xrightarrow{(1)\text{ and }(2)} [-24, 58, -63]^T \xrightarrow{(3)} [-24, 58, -295]^T
\]

Solve $UX = B$ for $B = (-24, 58, -295)$ by back-substitution: $z = -295$, $y = z - 58 = -353$, and
$x = -24 - 2y - z = 977$. Thus, $X_2 = (977, -353, -295)$.

(c) First find $B_3 = B_2 + X_2 = (-24, 10, 9) + (977, -353, -295) = (953, -343, -286)$. Then, as above,

\[
B_3 = [953, -343, -286]^T \xrightarrow{(1)\text{ and }(2)} [953, -2249, 2573]^T \xrightarrow{(3)} [953, -2249, 11569]^T
\]

Solve $UX = B$ for $B = (953, -2249, 11569)$ by back-substitution to obtain

\[ X_3 = (-38252, 13818, 11569). \]

(Each $X_i$ can be checked by substituting it into $AX = B_i$.)
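
Problem 3.42 shows why an LU factorization is worth keeping: once $L$ and $U$ are known, each new right-hand side costs only one forward and one back substitution. The following is a minimal sketch of that reuse, with hypothetical function names `lu` and `solve`; it assumes no zero pivot occurs, and it checks every result by substituting back into $AX = B$.

```python
from fractions import Fraction

def lu(A):
    """LU factorization without row interchanges (assumes nonzero pivots)."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        if U[j][j] == 0:
            raise ValueError("zero pivot: a row interchange would be needed")
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]
            L[i][j] = m
            U[i] = [a - m * b for a, b in zip(U[i], U[j])]
    return L, U

def solve(L, U, b):
    """Solve (LU)x = b: forward substitution for Ly = b, then back for Ux = y."""
    n = len(b)
    y = []
    for i in range(n):                   # L has 1's on its diagonal
        y.append(Fraction(b[i]) - sum(L[i][k] * y[k] for k in range(i)))
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

A = [[1, 2, 1], [2, 3, 3], [-3, -10, 2]]
L, U = lu(A)                             # factor once, reuse for every B_k
B = [1, 1, 1]
for _ in range(3):
    X = solve(L, U, B)
    ok = all(sum(a * x for a, x in zip(row, X)) == b for row, b in zip(A, B))
    print([int(x) for x in X], ok)       # X_k and whether A X_k = B_k holds
    B = [b + x for b, x in zip(B, X)]    # B_{k+1} = B_k + X_k
```

The intermediate vector y in `solve` is exactly the result of applying operations (1), (2), (3) to the right-hand side.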
Miscellaneous Problems

3.43. Let $L$ be a linear combination of the $m$ equations in $n$ unknowns in the system (3.2). Say $L$ is the
equation

\[ (c_1a_{11} + \cdots + c_ma_{m1})x_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})x_n = c_1b_1 + \cdots + c_mb_m \tag{1} \]

Show that any solution of the system (3.2) is also a solution of $L$.

Let $u = (k_1, \ldots, k_n)$ be a solution of (3.2). Then

\[ a_{i1}k_1 + a_{i2}k_2 + \cdots + a_{in}k_n = b_i \qquad (i = 1, 2, \ldots, m) \tag{2} \]

Substituting $u$ in the left-hand side of (1) and using (2), we get

\[
\begin{aligned}
(c_1a_{11} + \cdots + c_ma_{m1})k_1 &+ \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})k_n \\
&= c_1(a_{11}k_1 + \cdots + a_{1n}k_n) + \cdots + c_m(a_{m1}k_1 + \cdots + a_{mn}k_n) \\
&= c_1b_1 + \cdots + c_mb_m
\end{aligned}
\]

This is the right-hand side of (1); hence, $u$ is a solution of (1).

3.44. Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ by applying an elementary
operation (page 64). Show that $\mathcal{M}$ and $\mathcal{L}$ have the same solutions.

Each equation $L$ in $\mathcal{M}$ is a linear combination of equations in $\mathcal{L}$. Hence, by Problem 3.43, any solution of
$\mathcal{L}$ will also be a solution of $\mathcal{M}$. On the other hand, each elementary operation has an inverse elementary
operation, so $\mathcal{L}$ can be obtained from $\mathcal{M}$ by an elementary operation. This means that any solution of $\mathcal{M}$ is a
solution of $\mathcal{L}$. Thus, $\mathcal{L}$ and $\mathcal{M}$ have the same solutions.

3.45. Prove Theorem 3.4: Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ by a
sequence of elementary operations. Then $\mathcal{M}$ and $\mathcal{L}$ have the same solutions.

Each step of the sequence does not change the solution set (Problem 3.44). Thus, the original system $\mathcal{L}$
and the final system $\mathcal{M}$ (and any system in between) have the same solutions.

3.46. A system $\mathcal{L}$ of linear equations is said to be consistent if no linear combination of its equations is a
degenerate equation $L$ with a nonzero constant. Show that $\mathcal{L}$ is consistent if and only if $\mathcal{L}$ is
reducible to echelon form.

Suppose $\mathcal{L}$ is reducible to echelon form. Then $\mathcal{L}$ has a solution, which must also be a solution of every
linear combination of its equations. Thus, $L$, which has no solution, cannot be a linear combination of the
equations in $\mathcal{L}$. Thus, $\mathcal{L}$ is consistent.

On the other hand, suppose $\mathcal{L}$ is not reducible to echelon form. Then the reduction process must
yield a degenerate equation $L$ with a nonzero constant, which is a linear combination of the equations in $\mathcal{L}$.
Therefore, $\mathcal{L}$ is not consistent; that is, $\mathcal{L}$ is inconsistent.

3.47. Suppose u and v are distinct vectors. Show that, for distinct scalars k, the vectors u + k(u — v ) are 
distinct. 

Suppose $u + k_1(u - v) = u + k_2(u - v)$. We need only show that $k_1 = k_2$. We have
$k_1(u - v) = k_2(u - v)$, and so $(k_1 - k_2)(u - v) = 0$.
Because $u$ and $v$ are distinct, $u - v \neq 0$. Hence, $k_1 - k_2 = 0$, and so $k_1 = k_2$.

3.48. Suppose $AB$ is defined. Prove

(a) Suppose $A$ has a zero row. Then $AB$ has a zero row.

(b) Suppose $B$ has a zero column. Then $AB$ has a zero column.

(a) Let $R_i$ be the zero row of $A$, and $C_1, \ldots, C_n$ the columns of $B$. Then the $i$th row of $AB$ is

\[ (R_iC_1, R_iC_2, \ldots, R_iC_n) = (0, 0, \ldots, 0) \]

(b) $B^T$ has a zero row, and so $B^TA^T = (AB)^T$ has a zero row. Hence, $AB$ has a zero column.


SUPPLEMENTARY PROBLEMS 


Linear Equations, 2x2 Systems 

3.49. Determine whether each of the following equations is linear:

(a) 3x - 4y + 2yz = 8, (b) ex + 3y = π, (c) 2x - 3y + kz = 4

3.50. Solve (a) πx = 2, (b) 3x + 2 = 5x + 7 - 2x, (c) 6x + 2 - 4x = 5 + 2x - 3

3.51. Solve each of the following systems:

(a) 2x + 3y = 1      (b) 4x - 2y = 5      (c) 2x - 4 = 3y      (d) 2x - 4y = 10
    5x + 7y = 3         -6x + 3y = 1          5y - x = 5           3x - 6y = 15

3.52. Consider each of the following systems in unknowns x and y:

(a) x - ay = 1       (b) ax + 3y = 2      (c) x + ay = 3
    ax - 4y = b          12x + ay = b         2x + 5y = b

For which values of a does each system have a unique solution, and for which pairs of values (a, b) does each
system have more than one solution?

General Systems of Linear Equations 

3.53. Solve

(a) x + y + 2z = 4         (b) x - 2y + 3z = 2         (c) x + 2y + 3z = 3
    2x + 3y + 6z = 10          2x - 3y + 8z = 7            2x + 3y + 8z = 4
    3x + 6y + 10z = 17         3x - 4y + 13z = 8           5x + 8y + 19z = 11

3.54. Solve

(a) x - 2y = 5             (b) x + 2y - 3z + 2t = 2    (c) x + 2y + 4z - 5t = 3
    2x + 3y = 3                2x + 5y - 8z + 6t = 5       3x - y + 5z + 2t = 4
    3x + 2y = 7                3x + 4y - 5z + 2t = 4       5x - 4y + 6z + 9t = 2

3.55. Solve

(a) 2x - y - 4z = 2        (b) x + 2y - z + 3t = 3
    4x - 2y - 6z = 5           2x + 4y + 4z + 3t = 9
    6x - 3y - 8z = 8           3x + 6y - z + 8t = 10

3.56. Consider each of the following systems in unknowns x, y, z:

(a) x - 2y = 1             (b) x + 2y + 2z = 1         (c) x + y + az = 1
    x - y + az = 2             x + ay + 3z = 3             x + ay + z = 4
    ay + 9z = b                x + 11y + az = b            ax + y + z = b

For which values of a does the system have a unique solution, and for which pairs of values (a, b) does the
system have more than one solution? The value of b does not have any effect on whether the system has a
unique solution. Why?

Linear Combinations, Homogeneous Systems

3.57. Write v as a linear combination of u_1, u_2, u_3, where

(a) v = (4, -9, 2), u_1 = (1, 2, -1), u_2 = (1, 4, 2), u_3 = (1, -3, 2);

(b) v = (1, 3, 2), u_1 = (1, 2, 1), u_2 = (2, 6, 5), u_3 = (1, 7, 8);

(c) v = (1, 4, 6), u_1 = (1, 1, 2), u_2 = (2, 3, 5), u_3 = (3, 5, 8).

3.58. Let u_1 = (1, 1, 2), u_2 = (1, 3, -2), u_3 = (4, -2, -1) in R^3. Show that u_1, u_2, u_3 are orthogonal, and write v as
a linear combination of u_1, u_2, u_3, where (a) v = (5, -5, 9), (b) v = (1, -3, 3), (c) v = (1, 1, 1).

(Hint: Use Fourier coefficients.)

3.59. Find the dimension and a basis of the general solution W of each of the following homogeneous systems:

(a) x - y + 2z = 0         (b) x + 2y - 3z = 0         (c) x + 2y + 3z + t = 0
    2x + y + z = 0             2x + 5y + 2z = 0            2x + 4y + 7z + 4t = 0
    5x + y + 4z = 0            3x - y - 4z = 0             3x + 6y + 10z + 5t = 0

3.60. Find the dimension and a basis of the general solution W of each of the following systems:

(a) x_1 + 3x_2 + 2x_3 - x_4 - x_5 = 0            (b) 2x_1 - 4x_2 + 3x_3 - x_4 + 2x_5 = 0
    2x_1 + 6x_2 + 5x_3 + x_4 - x_5 = 0               3x_1 - 6x_2 + 5x_3 - 2x_4 + 4x_5 = 0
    5x_1 + 15x_2 + 12x_3 + x_4 - 3x_5 = 0            5x_1 - 10x_2 + 7x_3 - 3x_4 + 18x_5 = 0


Echelon Matrices, Row Canonical Form 

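3.61. Reduce each of the following matrices to echelon form and then to row canonical form:

\[
\text{(a)}\ \begin{bmatrix} 1 & 1 & 2 \\ 2 & 4 & 9 \\ 1 & 5 & 12 \end{bmatrix}, \qquad
\text{(b)}\ \begin{bmatrix} 1 & 2 & -1 & 2 & 1 \\ 2 & 4 & 1 & -2 & 5 \\ 3 & 6 & 3 & -7 & 7 \end{bmatrix}, \qquad
\text{(c)}\ \begin{bmatrix} 2 & 4 & 2 & -2 & 5 & 1 \\ 3 & 6 & 2 & 2 & 0 & 4 \\ 4 & 8 & 2 & 6 & -5 & 7 \end{bmatrix}
\]

3.62. Reduce each of the following matrices to echelon form and then to row canonical form:

\[
\text{(a)}\ \begin{bmatrix} 1 & 2 & 1 & 2 & 1 & 2 \\ 2 & 4 & 3 & 5 & 5 & 7 \\ 3 & 6 & 4 & 9 & 10 & 11 \\ 1 & 2 & 4 & 3 & 6 & 9 \end{bmatrix}, \qquad
\text{(b)}\ \begin{bmatrix} 0 & 1 & 2 & 3 \\ 0 & 3 & 8 & 12 \\ 0 & 0 & 4 & 6 \\ 0 & 2 & 7 & 10 \end{bmatrix}, \qquad
\text{(c)}\ \begin{bmatrix} 1 & 3 & 1 & 3 \\ 2 & 8 & 5 & 10 \\ 1 & 7 & 7 & 11 \\ 3 & 11 & 7 & 15 \end{bmatrix}
\]


3.63. Using only 0’s and 1’s, list all possible 2 x 2 matrices in row canonical form.

3.64. Using only 0’s and 1’s, find the number n of possible 3 x 3 matrices in row canonical form.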


Elementary Matrices, Applications 

3.65. Let e_1, e_2, e_3 denote, respectively, the following elementary row operations:

“Interchange R_2 and R_3,” “Replace R_2 by 3R_2,” “Replace R_1 by 2R_3 + R_1”

(a) Find the corresponding elementary matrices E_1, E_2, E_3.

(b) Find the inverse operations e_1^{-1}, e_2^{-1}, e_3^{-1}; their corresponding elementary matrices E_1', E_2', E_3'; and the
relationship between them and E_1, E_2, E_3.

(c) Describe the corresponding elementary column operations f_1, f_2, f_3.

(d) Find elementary matrices F_1, F_2, F_3 corresponding to f_1, f_2, f_3, and the relationship between them and
E_1, E_2, E_3.

3.66. Express each of the following matrices as a product of elementary matrices:

\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \qquad
B = \begin{bmatrix} 3 & -6 \\ -2 & 4 \end{bmatrix}, \qquad
C = \begin{bmatrix} 2 & 6 \\ -3 & -7 \end{bmatrix}, \qquad
D = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 3 \\ 3 & 8 & 7 \end{bmatrix}
\]

3.67. Find the inverse of each of the following matrices (if it exists):

\[
A = \begin{bmatrix} 1 & -2 & -1 \\ 2 & -3 & 1 \\ 3 & -4 & 4 \end{bmatrix}, \qquad
B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 6 & 1 \\ 3 & 10 & -1 \end{bmatrix}, \qquad
C = \begin{bmatrix} 1 & 3 & -2 \\ 2 & 8 & -3 \\ 1 & 7 & 1 \end{bmatrix}, \qquad
D = \begin{bmatrix} 2 & 1 & -1 \\ 5 & 2 & -3 \\ 0 & 2 & 1 \end{bmatrix}
\]


3.68. Find the inverse of each of the following n x n matrices:

(a) A has 1’s on the diagonal and superdiagonal (entries directly above the diagonal) and 0’s elsewhere.

(b) B has 1’s on and above the diagonal, and 0’s below the diagonal.

LU Factorization

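3.69. Find the LU factorization of each of the following matrices:

\[
\text{(a)}\ \begin{bmatrix} 1 & -1 & -1 \\ 3 & -4 & -2 \\ 2 & -3 & -2 \end{bmatrix}, \qquad
\text{(b)}\ \begin{bmatrix} 1 & 3 & -1 \\ 2 & 5 & 1 \\ 3 & 4 & 2 \end{bmatrix}, \qquad
\text{(c)}\ \begin{bmatrix} 2 & 3 & 6 \\ 4 & 7 & 9 \\ 3 & 5 & 4 \end{bmatrix}, \qquad
\text{(d)}\ \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 7 \\ 3 & 7 & 10 \end{bmatrix}
\]


3.70. Let A be the matrix in Problem 3.69(a). Find X_1, X_2, X_3, X_4, where

(a) X_1 is the solution of AX = B_1, where B_1 = (1, 1, 1)^T.

(b) For k > 1, X_k is the solution of AX = B_k, where B_k = B_{k-1} + X_{k-1}.

3.71. Let B be the matrix in Problem 3.69(b). Find the LDU factorization of B.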

Miscellaneous Problems 

3.72. Consider the following systems in unknowns x and y:

(a) ax + by = 1        (b) ax + by = 0
    cx + dy = 0            cx + dy = 1

Suppose D = ad - bc ≠ 0. Show that each system has the unique solution:

(a) x = d/D, y = -c/D, (b) x = -b/D, y = a/D.

3.73. Find the inverse of the row operation “Replace R_i by kR_j + k'R_i (k' ≠ 0).”

3.74. Prove that deleting the last column of an echelon form (respectively, the row canonical form) of an augmented
matrix M = [A, B] yields an echelon form (respectively, the row canonical form) of A.

3.75. Let e be an elementary row operation and E its elementary matrix, and let f be the corresponding elementary
column operation and F its elementary matrix. Prove

(a) f(A) = (e(A^T))^T, (b) F = E^T, (c) f(A) = AF.

3.76. Matrix A is equivalent to matrix B, written A ≈ B, if there exist nonsingular matrices P and Q such that
B = PAQ. Prove that ≈ is an equivalence relation; that is,


(a) A ≈ A, (b) If A ≈ B, then B ≈ A, (c) If A ≈ B and B ≈ C, then A ≈ C.


ANSWERS TO SUPPLEMENTARY PROBLEMS


Notation: A = [R_1; R_2; ...] denotes the matrix A with rows R_1, R_2, .... The elements in each row are separated
by commas (which may be omitted with single digits), the rows are separated by semicolons, and 0 denotes a zero
row. For example,

\[
A = [1,2,3,4;\ 5,-6,7,-8;\ 0] = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & -6 & 7 & -8 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]

3.49. (a) no, (b) yes, (c) linear in x, y, z; not linear in x, y, z, k

3.50. (a) x = 2/π, (b) no solution, (c) every scalar k is a solution

3.51. (a) (2, -1), (b) no solution, (c) (5, 2), (d) (5 + 2a, a)

3.52. (a) a ≠ ±2, (2, 2), (-2, -2), (b) a ≠ ±6, (6, 4), (-6, -4), (c) a ≠ 5/2, (5/2, 6)

3.53. (a) (2, 1, 1/2), (b) no solution, (c) u = (-7a - 1, 2a + 2, a)

3.54. (a) (3, -1), (b) u = (-a + 2b, 1 + 2a - 2b, a, b), (c) no solution

3.55. (a) u = (a/2 + 2, a, 1/2), (b) u = ((7 - 5b - 4a)/2, a, (1 + b)/2, b)

3.56. (a) a ≠ ±3, (3, 3), (-3, -3), (b) a ≠ 5 and a ≠ -1, (5, 7), (-1, -5),
(c) a ≠ 1 and a ≠ -2, (-2, -5)

3.57. (a) 2, -1, 3, (b) 4, -2, 1, (c) not possible

3.58. (a) 3, -2, 1, (b) 2/3, -1, 1/3, (c) 2/3, 1/7, 1/21

3.59. (a) dim W = 1, u_1 = (-1, 1, 1), (b) dim W = 0, no basis,
(c) dim W = 2, u_1 = (-2, 1, 0, 0), u_2 = (5, 0, -2, 1)

3.60. (a) dim W = 3, u_1 = (-3, 1, 0, 0, 0), u_2 = (7, 0, -3, 1, 0), u_3 = (3, 0, -1, 0, 1),
(b) dim W = 2, u_1 = (2, 1, 0, 0, 0), u_2 = (5, 0, -5, -3, 1)

3.61. (a) [1, 0, -1/2; 0, 1, 5/2; 0], (b) [1,2,0,0,2; 0,0,1,0,5; 0,0,0,1,2],
(c) [1, 2, 0, 4, -5, 3; 0, 0, 1, -5, 15/2, -5/2; 0]

3.62. (a) [1,2,0,0,-4,-2; 0,0,1,0,1,2; 0,0,0,1,2,1; 0],
(b) [0,1,0,0; 0,0,1,0; 0,0,0,1; 0], (c) [1,0,0,4; 0,1,0,-1; 0,0,1,2; 0]

3.63. 5: [1,0; 0,1], [1,1; 0,0], [1,0; 0,0], [0,1; 0,0], 0

3.64. 16

3.65. (a) [1,0,0; 0,0,1; 0,1,0], [1,0,0; 0,3,0; 0,0,1], [1,0,2; 0,1,0; 0,0,1],
(b) R_2 <-> R_3; (1/3)R_2 -> R_2; -2R_3 + R_1 -> R_1; each E_i' = E_i^{-1},
(c) C_2 <-> C_3, 3C_2 -> C_2, 2C_3 + C_1 -> C_1, (d) each F_i = E_i^T.

3.66. A = [1,0; 3,1][1,0; 0,-2][1,2; 0,1], B is not invertible,
C = [1,0; -3/2,1][1,0; 0,2][1,6; 0,1][2,0; 0,1],
D = [100; 010; 301][100; 010; 021][100; 013; 001][120; 010; 001]

3.67. A^{-1} = [-8,12,-5; -5,7,-3; 1,-2,1], B has no inverse,
C^{-1} = [29/2, -17/2, 7/2; -5/2, 3/2, -1/2; 3, -2, 1], D^{-1} = [8,-3,-1; -5,2,1; 10,-4,-1]

3.68. A^{-1} = [1,-1,1,-1,...; 0,1,-1,1,-1,...; 0,0,1,-1,1,-1,...; ...; 0,...,0,1]
B^{-1} has 1’s on the diagonal, -1’s on the superdiagonal, and 0’s elsewhere.

3.69. (a) [100; 310; 211][1,-1,-1; 0,-1,1; 0,0,-1],
(b) [100; 210; 351][1,3,-1; 0,-1,3; 0,0,-10],
(c) [1,0,0; 2,1,0; 3/2,1/2,1][2,3,6; 0,1,-3; 0,0,-7/2],
(d) There is no LU decomposition.

3.70. X_1 = [1,1,-1]^T, B_2 = [2,2,0]^T, X_2 = [6,4,0]^T, B_3 = [8,6,0]^T, X_3 = [22,16,-2]^T,
B_4 = [30,22,-2]^T, X_4 = [86,62,-6]^T

3.71. B = [100; 210; 351] diag(1, -1, -10) [1,3,-1; 0,1,-3; 0,0,1]

3.73. Replace R_i by -(k/k')R_j + (1/k')R_i.

3.75. (c) f(A) = (e(A^T))^T = (EA^T)^T = (A^T)^T E^T = AF

3.76. (a) A = IAI. (b) If A = PBQ, then B = P^{-1}AQ^{-1}.
(c) If A = PBQ and B = P'CQ', then A = (PP')C(Q'Q).


Vector Spaces 


4.1 Introduction 


This chapter introduces the underlying structure of linear algebra, that of a finite-dimensional vector 
space. The definition of a vector space V, whose elements are called vectors, involves an arbitrary field 
K, whose elements are called scalars. The following notation will be used (unless otherwise stated or 
implied): 


V                the given vector space
u, v, w          vectors in V
K                the given number field
a, b, c, or k    scalars in K


Almost nothing essential is lost if the reader assumes that K is the real field R or the complex field C. 

The reader might suspect that the real line R has “dimension” one, the cartesian plane R^2 has
“dimension” two, and the space R^3 has “dimension” three. This chapter formalizes the notion of
“dimension,” and this definition will agree with the reader’s intuition. 

Throughout this text, we will use the following set notation: 


a ∈ A         Element a belongs to set A
a, b ∈ A      Elements a and b belong to A
∀x ∈ A        For every x in A
∃x ∈ A        There exists an x in A
A ⊆ B         A is a subset of B
A ∩ B         Intersection of A and B
A ∪ B         Union of A and B
∅             Empty set


4.2 Vector Spaces 


The following defines the notion of a vector space V where K is the field of scalars. 

DEFINITION 4.1: Let V be a nonempty set with two operations: 

(i) Vector Addition: This assigns to any u, v ∈ V a sum u + v in V.

(ii) Scalar Multiplication: This assigns to any u ∈ V, k ∈ K a product ku ∈ V.

Then V is called a vector space (over the field K) if the following axioms hold for
any vectors u, v, w ∈ V:

[A1] (u + v) + w = u + (v + w)

[A2] There is a vector in V, denoted by 0 and called the zero vector, such that, for any
u ∈ V,

u + 0 = 0 + u = u

[A3] For each u ∈ V, there is a vector in V, denoted by -u, and called the negative of u,
such that

u + (-u) = (-u) + u = 0.

[A4] u + v = v + u.

[M1] k(u + v) = ku + kv, for any scalar k ∈ K.

[M2] (a + b)u = au + bu, for any scalars a, b ∈ K.

[M3] (ab)u = a(bu), for any scalars a, b ∈ K.

[M4] 1u = u, for the unit scalar 1 ∈ K.

The above axioms naturally split into two sets (as indicated by the labeling of the axioms). The first four 
are concerned only with the additive structure of V and can be summarized by saying V is a commutative 
group under addition. This means 

(a) Any sum v_1 + v_2 + ... + v_m of vectors requires no parentheses and does not depend on the order of
the summands. 

(b) The zero vector 0 is unique, and the negative —u of a vector u is unique. 

(c) (Cancellation Law) If u + w = v + w, then u = v. 

Also, subtraction in V is defined by u - v = u + (-v), where -v is the unique negative of v.

On the other hand, the remaining four axioms are concerned with the “action” of the field K of scalars 
on the vector space V. Using these additional axioms, we prove (Problem 4.2) the following simple 
properties of a vector space. 

THEOREM 4.1: Let V be a vector space over a field K. 

(i) For any scalar k ∈ K and 0 ∈ V, k0 = 0.

(ii) For 0 ∈ K and any vector u ∈ V, 0u = 0.

(iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0.

(iv) For any k ∈ K and any u ∈ V, (-k)u = k(-u) = -ku.


4.3 Examples of Vector Spaces 


This section lists important examples of vector spaces that will be used throughout the text. 

Space K^n

Let K be an arbitrary field. The notation K^n is frequently used to denote the set of all n-tuples of elements
in K. Here K^n is a vector space over K using the following operations:

(i) Vector Addition: (a_1, a_2, ..., a_n) + (b_1, b_2, ..., b_n) = (a_1 + b_1, a_2 + b_2, ..., a_n + b_n)

(ii) Scalar Multiplication: k(a_1, a_2, ..., a_n) = (ka_1, ka_2, ..., ka_n)

The zero vector in K^n is the n-tuple of zeros,

0 = (0, 0, ..., 0)

and the negative of a vector is defined by

-(a_1, a_2, ..., a_n) = (-a_1, -a_2, ..., -a_n)
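
As a concrete illustration, the operations on K^n can be sketched in Python. This is only a sketch under stated assumptions: Python's Fraction type stands in for the field K (here the rationals Q), the sample vectors are chosen for illustration, and the names vec_add, scalar_mul, and negative are hypothetical helpers rather than notation from the text.

```python
from fractions import Fraction

def vec_add(u, v):
    """Componentwise sum of two n-tuples over the same field."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(u, v))

def scalar_mul(k, u):
    """Multiply every component of u by the scalar k."""
    return tuple(k * a for a in u)

def negative(u):
    """The negative -u, computed componentwise."""
    return tuple(-a for a in u)

# Illustrative vectors in Q^3 (assumed for this sketch, not taken from the text).
u = (Fraction(1), Fraction(-2), Fraction(1, 2))
v = (Fraction(3), Fraction(0), Fraction(5, 2))
zero = (Fraction(0),) * 3

total = vec_add(u, v)        # (4, -2, 3)
doubled = scalar_mul(2, u)   # (2, -4, 1)
```

Exact rational arithmetic is used so that identities such as u + (-u) = 0 hold exactly rather than up to rounding.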

Observe that these are the same as the operations defined for R^n in Chapter 1. The proof that K^n is a vector
space is identical to the proof of Theorem 1.1, which we now regard as stating that R^n with the operations
defined there is a vector space over R.

Polynomial Space P(t)

Let P(t) denote the set of all polynomials of the form

p(t) = a_0 + a_1 t + a_2 t^2 + ... + a_s t^s    (s = 1, 2, ...)

where the coefficients a_i belong to a field K. Then P(t) is a vector space over K using the following operations:

(i) Vector Addition: Here p(t) + q(t) in P(t) is the usual operation of addition of polynomials.

(ii) Scalar Multiplication: Here kp(t) in P(t) is the usual operation of the product of a scalar k and a
polynomial p(t).

The zero polynomial 0 is the zero vector in P(t).

Polynomial Space P_n(t)

Let P_n(t) denote the set of all polynomials p(t) over a field K, where the degree of p(t) is less than or equal
to n; that is,

p(t) = a_0 + a_1 t + a_2 t^2 + ... + a_s t^s

where s ≤ n. Then P_n(t) is a vector space over K with respect to the usual operations of addition of
polynomials and of multiplication of a polynomial by a constant (just like the vector space P(t) above). We
include the zero polynomial 0 as an element of P_n(t), even though its degree is undefined.

Matrix Space M_{m,n}

The notation M_{m,n}, or simply M, will be used to denote the set of all m x n matrices with entries in a field
K. Then M_{m,n} is a vector space over K with respect to the usual operations of matrix addition and scalar
multiplication of matrices, as indicated by Theorem 2.1.

Function Space F(X)

Let X be a nonempty set and let K be an arbitrary field. Let F(X) denote the set of all functions of X into K.
[Note that F(X) is nonempty, because X is nonempty.] Then F(X) is a vector space over K with respect to
the following operations:

(i) Vector Addition: The sum of two functions f and g in F(X) is the function f + g in F(X) defined by

(f + g)(x) = f(x) + g(x)    ∀x ∈ X

(ii) Scalar Multiplication: The product of a scalar k ∈ K and a function f in F(X) is the function kf in
F(X) defined by

(kf)(x) = kf(x)    ∀x ∈ X

The zero vector in F(X) is the zero function 0, which maps every x ∈ X into the zero element 0 ∈ K;

0(x) = 0    ∀x ∈ X

Also, for any function f in F(X), the negative of f is the function -f in F(X) defined by

(-f)(x) = -f(x)    ∀x ∈ X
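
The pointwise operations of F(X) translate directly into code, because a "vector" here is simply a function. The sketch below is illustrative only: it assumes X is the set of integers and K is the field of rationals (integer values suffice for the samples), and f_add, f_scale, f_neg, and zero_fn are hypothetical names introduced for this example.

```python
def f_add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def f_scale(k, f):
    """Pointwise scalar multiple: (kf)(x) = k * f(x)."""
    return lambda x: k * f(x)

def f_neg(f):
    """Negative of f: (-f)(x) = -f(x)."""
    return lambda x: -f(x)

def zero_fn(x):
    """The zero function, which is the zero vector of F(X)."""
    return 0

# Two sample "vectors" in F(X), chosen only for illustration.
f = lambda x: x * x
g = lambda x: 3 * x + 1

h = f_add(f, f_scale(2, g))   # h(x) = x^2 + 6x + 2
```

Note that h is itself a function on X, which is exactly what closure of F(X) under the two operations requires.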

Fields and Subfields 

Suppose a field E is an extension of a field K; that is, suppose E is a field that contains K as a subfield.
Then E may be viewed as a vector space over K using the following operations:

(i) Vector Addition: Here u + v in E is the usual addition in E.

(ii) Scalar Multiplication: Here ku in E, where k ∈ K and u ∈ E, is the usual product of k and u as
elements of E.

That is, the eight axioms of a vector space are satisfied by E and its subfield K with respect to the above
two operations.


4.4 Linear Combinations, Spanning Sets


Let V be a vector space over a field K. A vector v in V is a linear combination of vectors u_1, u_2, ..., u_m in
V if there exist scalars a_1, a_2, ..., a_m in K such that

v = a_1 u_1 + a_2 u_2 + ... + a_m u_m

Alternatively, v is a linear combination of u_1, u_2, ..., u_m if there is a solution to the vector equation

v = x_1 u_1 + x_2 u_2 + ... + x_m u_m

where x_1, x_2, ..., x_m are unknown scalars.


EXAMPLE 4.1 (Linear Combinations in R^n) Suppose we want to express v = (3, 7, -4) in R^3 as a linear
combination of the vectors

u_1 = (1, 2, 3),   u_2 = (2, 3, 7),   u_3 = (3, 5, 6)

We seek scalars x, y, z such that v = xu_1 + yu_2 + zu_3; that is,

[ 3]     [1]     [2]     [3]            x + 2y + 3z =  3
[ 7] = x [2] + y [3] + z [5]     or    2x + 3y + 5z =  7
[-4]     [3]     [7]     [6]           3x + 7y + 6z = -4

(For notational convenience, we have written the vectors in R^3 as columns, because it is then easier to find the
equivalent system of linear equations.) Reducing the system to echelon form yields

x + 2y + 3z =   3                     x + 2y + 3z =   3
    -y -  z =   1      and then           -y -  z =   1
     y - 3z = -13                             -4z = -12

Back-substitution yields the solution x = 2, y = -4, z = 3. Thus, v = 2u_1 - 4u_2 + 3u_3.
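
The computation in Example 4.1 can be checked mechanically. The following is a minimal sketch, assuming the coefficient matrix is square and invertible (as it is here); solve_square is a hypothetical helper that performs Gauss-Jordan elimination with exact rational arithmetic, not a routine defined in the text.

```python
from fractions import Fraction

def solve_square(A, b):
    """Solve A x = b by Gauss-Jordan elimination.

    Assumes A is square and invertible; a singular A makes the pivot
    search fail with StopIteration.
    """
    n = len(A)
    # Augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        lead = M[col][col]
        M[col] = [x / lead for x in M[col]]          # scale pivot to 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

u1, u2, u3 = (1, 2, 3), (2, 3, 7), (3, 5, 6)
v = (3, 7, -4)

# The vectors u1, u2, u3 are the COLUMNS of the coefficient matrix.
A = [list(row) for row in zip(u1, u2, u3)]
x, y, z = solve_square(A, v)   # x = 2, y = -4, z = 3
```

Writing the u's as columns mirrors the remark that follows: v plays the role of the column B of constants in AX = B.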


Remark: Generally speaking, the question of expressing a given vector v in K^n as a linear
combination of vectors u_1, u_2, ..., u_m in K^n is equivalent to solving a system AX = B of linear equations,
where v is the column B of constants, and the u's are the columns of the coefficient matrix A. Such a
system may have a unique solution (as above), many solutions, or no solution. The last case (no
solution) means that v cannot be written as a linear combination of the u's.

EXAMPLE 4.2 (Linear combinations in P(t)) Suppose we want to express the polynomial v = 3t^2 + 5t - 5 as a
linear combination of the polynomials

p_1 = t^2 + 2t + 1,   p_2 = 2t^2 + 5t + 4,   p_3 = t^2 + 3t + 6

We seek scalars x, y, z such that v = xp_1 + yp_2 + zp_3; that is,

3t^2 + 5t - 5 = x(t^2 + 2t + 1) + y(2t^2 + 5t + 4) + z(t^2 + 3t + 6)    (*)

There are two ways to proceed from here.

(1) Expand the right-hand side of (*) obtaining:

3t^2 + 5t - 5 = xt^2 + 2xt + x + 2yt^2 + 5yt + 4y + zt^2 + 3zt + 6z
              = (x + 2y + z)t^2 + (2x + 5y + 3z)t + (x + 4y + 6z)

Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form:

 x + 2y +  z =  3           x + 2y +  z =  3           x + 2y + z =  3
2x + 5y + 3z =  5     or         y +  z = -1     or         y + z = -1
 x + 4y + 6z = -5               2y + 5z = -8                   3z = -6

The system is in triangular form and has a solution. Back-substitution yields the solution x = 3, y = 1, z = -2.
Thus,

v = 3p_1 + p_2 - 2p_3

(2) The equation (*) is actually an identity in the variable t; that is, the equation holds for any value
of t. We can obtain three equations in the unknowns x, y, z by setting t equal to any three values.
For example,

Set t = 0 in (1) to obtain:     x + 4y + 6z = -5

Set t = 1 in (1) to obtain:     4x + 11y + 10z = 3

Set t = -1 in (1) to obtain:    y + 4z = -7

Reducing this system to echelon form and solving by back-substitution again yields the solution x = 3, y = 1,
z = -2. Thus (again), v = 3p_1 + p_2 - 2p_3.
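
Both approaches of Example 4.2 can be verified with a few lines of Python. This is a sketch under stated assumptions: a polynomial a_0 + a_1 t + a_2 t^2 is stored as the coefficient list [a_0, a_1, a_2], and poly_eval and lin_comb are hypothetical helper names introduced here.

```python
def poly_eval(coeffs, t):
    """Evaluate a_0 + a_1 t + a_2 t^2 + ... at t using Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * t + a
    return result

def lin_comb(scalars, polys):
    """Coefficient list of the linear combination sum(k_i * p_i)."""
    out = [0] * max(len(p) for p in polys)
    for k, p in zip(scalars, polys):
        for i, a in enumerate(p):
            out[i] += k * a
    return out

# Coefficient lists [a_0, a_1, a_2] for the polynomials of Example 4.2.
p1, p2, p3 = [1, 2, 1], [4, 5, 2], [6, 3, 1]
v = [-5, 5, 3]

# Approach (1): compare coefficients of 3*p1 + p2 - 2*p3 with those of v.
combo = lin_comb([3, 1, -2], [p1, p2, p3])

# Approach (2): the equations obtained by setting t = 0, 1, -1.
# Each entry is (coefficients of x, y, z; right-hand side).
rows = [([poly_eval(p, t) for p in (p1, p2, p3)], poly_eval(v, t))
        for t in (0, 1, -1)]
```

The list rows reproduces the three equations of approach (2), and combo equals v, which confirms v = 3p_1 + p_2 - 2p_3.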

Spanning Sets 

Let V be a vector space over K. Vectors u_1, u_2, ..., u_m in V are said to span V or to form a spanning set of
V if every v in V is a linear combination of the vectors u_1, u_2, ..., u_m; that is, if there exist scalars
a_1, a_2, ..., a_m in K such that

v = a_1 u_1 + a_2 u_2 + ... + a_m u_m

The following remarks follow directly from the definition.

Remark 1: Suppose u_1, u_2, ..., u_m span V. Then, for any vector w, the set w, u_1, u_2, ..., u_m also
spans V.

Remark 2: Suppose u_1, u_2, ..., u_m span V and suppose u_k is a linear combination of some of the
other u's. Then the u's without u_k also span V.

Remark 3: Suppose u_1, u_2, ..., u_m span V and suppose one of the u's is the zero vector. Then the u's
without the zero vector also span V.

EXAMPLE 4.3 Consider the vector space V = R^3.

(a) We claim that the following vectors form a spanning set of R^3:

e_1 = (1, 0, 0),   e_2 = (0, 1, 0),   e_3 = (0, 0, 1)

Specifically, if v = (a, b, c) is any vector in R^3, then

v = ae_1 + be_2 + ce_3

For example, v = (5, -6, 2) = 5e_1 - 6e_2 + 2e_3.

(b) We claim that the following vectors also form a spanning set of R^3:

w_1 = (1, 1, 1),   w_2 = (1, 1, 0),   w_3 = (1, 0, 0)

Specifically, if v = (a, b, c) is any vector in R^3, then (Problem 4.62)

v = (a, b, c) = cw_1 + (b - c)w_2 + (a - b)w_3

For example, v = (5, -6, 2) = 2w_1 - 8w_2 + 11w_3.

(c) One can show (Problem 3.24) that v = (2, 7, 8) cannot be written as a linear combination of the vectors

u_1 = (1, 2, 3),   u_2 = (1, 3, 5),   u_3 = (1, 5, 9)

Accordingly, u_1, u_2, u_3 do not span R^3.

EXAMPLE 4.4 Consider the vector space V = P_n(t) consisting of all polynomials of degree ≤ n.

(a) Clearly every polynomial in P_n(t) can be expressed as a linear combination of the n + 1 polynomials

1, t, t^2, t^3, ..., t^n

Thus, these powers of t (where 1 = t^0) form a spanning set for P_n(t).

(b) One can also show that, for any scalar c, the following n + 1 powers of t - c,

1, t - c, (t - c)^2, (t - c)^3, ..., (t - c)^n

(where (t - c)^0 = 1), also form a spanning set for P_n(t).


EXAMPLE 4.5 Consider the vector space M = M_{2,2} consisting of all 2 x 2 matrices, and consider the following
four matrices in M:

E_11 = [1,0; 0,0],   E_12 = [0,1; 0,0],   E_21 = [0,0; 1,0],   E_22 = [0,0; 0,1]

Then clearly any matrix A in M can be written as a linear combination of the four matrices. For example,

A = [5,-6; 7,8] = 5E_11 - 6E_12 + 7E_21 + 8E_22

Accordingly, the four matrices E_11, E_12, E_21, E_22 span M.


4.5 Subspaces 


This section introduces the important notion of a subspace. 

DEFINITION 4.2: Let V be a vector space over a field K and let W be a subset of V. Then W is a
subspace of V if W is itself a vector space over K with respect to the operations of
vector addition and scalar multiplication on V.

The way in which one shows that any set W is a vector space is to show that W satisfies the eight
axioms of a vector space. However, if W is a subset of a vector space V, then some of the axioms
automatically hold in W, because they already hold in V. Simple criteria for identifying subspaces follow.

THEOREM 4.2: Suppose W is a subset of a vector space V. Then W is a subspace of V if the following
two conditions hold:

(a) The zero vector 0 belongs to W.

(b) For every u, v ∈ W, k ∈ K: (i) The sum u + v ∈ W. (ii) The multiple ku ∈ W.

Property (i) in (b) states that W is closed under vector addition, and property (ii) in (b) states that W is
closed under scalar multiplication. Both properties may be combined into the following equivalent single
statement:

(b') For every u, v ∈ W, a, b ∈ K, the linear combination au + bv ∈ W.

Now let V be any vector space. Then V automatically contains two subspaces: the set {0} consisting of 
the zero vector alone and the whole space V itself. These are sometimes called the trivial subspaces of V. 
Examples of nontrivial subspaces follow. 

EXAMPLE 4.6 Consider the vector space V = R^3.

(a) Let U consist of all vectors in R^3 whose entries are equal; that is,

U = {(a, b, c) : a = b = c}

For example, (1, 1, 1), (-3, -3, -3), (7, 7, 7), (-2, -2, -2) are vectors in U. Geometrically, U is the line
through the origin O and the point (1, 1, 1) as shown in Fig. 4-1(a). Clearly 0 = (0, 0, 0) belongs to U, because all
entries in 0 are equal. Further, suppose u and v are arbitrary vectors in U, say, u = (a, a, a) and v = (b, b, b).
Then, for any scalar k ∈ R, the following are also vectors in U:

u + v = (a + b, a + b, a + b) and ku = (ka, ka, ka)

Thus, U is a subspace of R^3.

(b) Let W be any plane in R^3 passing through the origin, as pictured in Fig. 4-1(b). Then 0 = (0, 0, 0) belongs to W,
because we assumed W passes through the origin O. Further, suppose u and v are vectors in W. Then u and v may
be viewed as arrows in the plane W emanating from the origin O, as in Fig. 4-1(b). The sum u + v and any
multiple ku of u also lie in the plane W. Thus, W is a subspace of R^3.




Figure 4-1 


EXAMPLE 4.7

(a) Let V = M_{n,n}, the vector space of n x n matrices. Let W_1 be the subset of all (upper) triangular matrices and let
W_2 be the subset of all symmetric matrices. Then W_1 is a subspace of V, because W_1 contains the zero matrix 0
and W_1 is closed under matrix addition and scalar multiplication; that is, the sum and scalar multiple of such
triangular matrices are also triangular. Similarly, W_2 is a subspace of V.

(b) Let V = P(t), the vector space P(t) of polynomials. Then the space P_n(t) of polynomials of degree at most n may
be viewed as a subspace of P(t). Let Q(t) be the collection of polynomials with only even powers of t. For
example, the following are polynomials in Q(t):

p_1 = 3 + 4t^2 - 5t^6 and p_2 = 6 - 7t^4 + 9t^6 + 3t^12

(We assume that any constant k = kt^0 is an even power of t.) Then Q(t) is a subspace of P(t).

(c) Let V be the vector space of real-valued functions. Then the collection W_1 of continuous functions and the
collection W_2 of differentiable functions are subspaces of V.


Intersection of Subspaces 

Let U and W be subspaces of a vector space V. We show that the intersection U ∩ W is also a subspace of
V. Clearly, 0 ∈ U and 0 ∈ W, because U and W are subspaces; whence 0 ∈ U ∩ W. Now suppose u and v
belong to the intersection U ∩ W. Then u, v ∈ U and u, v ∈ W. Further, because U and W are subspaces,
for any scalars a, b ∈ K,

au + bv ∈ U and au + bv ∈ W

Thus, au + bv ∈ U ∩ W. Therefore, U ∩ W is a subspace of V.

The above result generalizes as follows. 


THEOREM 4.3: The intersection of any number of subspaces of a vector space V is a subspace of V.

Solution Space of a Homogeneous System

Consider a system AX = B of linear equations in n unknowns. Then every solution u may be viewed as a
vector in K^n. Thus, the solution set of such a system is a subset of K^n. Now suppose the system is
homogeneous; that is, suppose the system has the form AX = 0. Let W be its solution set. Because A0 = 0,
the zero vector 0 ∈ W. Moreover, suppose u and v belong to W. Then u and v are solutions of AX = 0, or,
in other words, Au = 0 and Av = 0. Therefore, for any scalars a and b, we have

A(au + bv) = aAu + bAv = a0 + b0 = 0 + 0 = 0

Thus, au + bv belongs to W, because it is a solution of AX = 0. Accordingly, W is a subspace of K^n.
We state the above result formally.

THEOREM 4.4: The solution set W of a homogeneous system AX = 0 in n unknowns is a subspace of
K^n.

We emphasize that the solution set of a nonhomogeneous system AX = B is not a subspace of K^n. In
fact, the zero vector 0 does not belong to its solution set.
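
The contrast between homogeneous and nonhomogeneous systems can be seen on a small example. The sketch below is illustrative only: the one-equation system x + 2y - 3z = 0, the right-hand side 6, and the particular solutions are all assumptions chosen for this example, and mat_vec and lin_comb are hypothetical helper names.

```python
def mat_vec(A, x):
    """Matrix-vector product A x, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def lin_comb(a, u, b, v):
    """The linear combination a*u + b*v of two vectors."""
    return [a * ui + b * vi for ui, vi in zip(u, v)]

# Assumed sample system: the single equation x + 2y - 3z = 0.
A = [[1, 2, -3]]

# Two solutions of the homogeneous system A X = 0.
u = [-2, 1, 0]
v = [3, 0, 1]
w = lin_comb(4, u, -5, v)      # an arbitrary linear combination of u and v

# Nonhomogeneous system A X = B with B = [6]; p and q are solutions.
B = [6]
p = [6, 0, 0]
q = [4, 1, 0]
s = lin_comb(1, p, 1, q)       # p + q
```

Here A w is again the zero column, whereas A s = [12] differs from B and A 0 = [0] differs from B, so the solution set of the nonhomogeneous system is neither closed under addition nor does it contain the zero vector.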


4.6 Linear Spans, Row Space of a Matrix 

Suppose u_1, u_2, ..., u_m are any vectors in a vector space V. Recall (Section 4.4) that any vector of the form
a_1 u_1 + a_2 u_2 + ... + a_m u_m, where the a_i are scalars, is called a linear combination of u_1, u_2, ..., u_m. The
collection of all such linear combinations, denoted by

span(u_1, u_2, ..., u_m) or span(u_i)

is called the linear span of u_1, u_2, ..., u_m.

Clearly the zero vector 0 belongs to span(u_i), because

0 = 0u_1 + 0u_2 + ... + 0u_m

Furthermore, suppose v and v' belong to span(u_i), say,

v = a_1 u_1 + a_2 u_2 + ... + a_m u_m and v' = b_1 u_1 + b_2 u_2 + ... + b_m u_m

Then,

v + v' = (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + ... + (a_m + b_m)u_m

and, for any scalar k ∈ K,

kv = ka_1 u_1 + ka_2 u_2 + ... + ka_m u_m

Thus, v + v' and kv also belong to span(u_i). Accordingly, span(u_i) is a subspace of V.

More generally, for any subset S of V, span(S) consists of all linear combinations of vectors in S or,
when S = ∅, span(S) = {0}. Thus, in particular, S is a spanning set (Section 4.4) of span(S).

The following theorem, which was partially proved above, holds.

THEOREM 4.5: Let S be a subset of a vector space V.

(i) Then span(S) is a subspace of V that contains S.

(ii) If W is a subspace of V containing S, then span(S) ⊆ W.

Condition (ii) in Theorem 4.5 may be interpreted as saying that span(S) is the “smallest” subspace of V
containing S.
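
For a finite set S of vectors in K^n, membership in span(S) can be tested by row reduction. The sketch below is one possible approach, not a method stated in the text at this point: it assumes the fact (developed later in the chapter) that the number of nonzero rows in an echelon form measures the size of the span, so that v lies in span(S) exactly when adjoining v as an extra row does not increase that number. The names echelon_rank and in_span are hypothetical, and the sample vectors are those of Example 4.3.

```python
from fractions import Fraction

def echelon_rank(rows):
    """Number of nonzero rows after reducing the given rows to echelon form."""
    M = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    ncols = len(M[0]) if M else 0
    for col in range(ncols):
        # Look for a pivot in this column at or below the current row.
        pivot = next((r for r in range(rank, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, len(M)):
            factor = M[r][col] / M[rank][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

def in_span(v, vectors):
    """True when v is a linear combination of the given vectors."""
    vectors = [list(u) for u in vectors]
    return echelon_rank(vectors) == echelon_rank(vectors + [list(v)])

w_vectors = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]   # Example 4.3(b)
u_vectors = [(1, 2, 3), (1, 3, 5), (1, 5, 9)]   # Example 4.3(c)

spans_first = in_span((5, -6, 2), w_vectors)    # True
spans_second = in_span((2, 7, 8), u_vectors)    # False
```

The second test agrees with Example 4.3(c): the vector (2, 7, 8) is not a linear combination of (1, 2, 3), (1, 3, 5), (1, 5, 9).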

EXAMPLE 4.8 Consider the vector space V = R^3.

(a) Let u be any nonzero vector in R^3. Then span(u) consists of all scalar multiples of u. Geometrically, span(u) is
the line through the origin O and the endpoint of u, as shown in Fig. 4-2(a).

(b) Let u and v be vectors in R^3 that are not multiples of each other. Then span(u, v) is the plane through the origin O
and the endpoints of u and v as shown in Fig. 4-2(b).

(c) Consider the vectors e_1 = (1, 0, 0), e_2 = (0, 1, 0), e_3 = (0, 0, 1) in R^3. Recall [Example 4.3(a)] that every vector
in R^3 is a linear combination of e_1, e_2, e_3. That is, e_1, e_2, e_3 form a spanning set of R^3. Accordingly,

span(e_1, e_2, e_3) = R^3.

Row Space of a Matrix 

Let A = [a_ij] be an arbitrary m x n matrix over a field K. The rows of A,

R_1 = (a_11, a_12, ..., a_1n),   R_2 = (a_21, a_22, ..., a_2n),   ...,   R_m = (a_m1, a_m2, ..., a_mn)

may be viewed as vectors in K^n; hence, they span a subspace of K^n called the row space of A and denoted
by rowsp(A). That is,

rowsp(A) = span(R_1, R_2, ..., R_m)

Analogously, the columns of A may be viewed as vectors in K^m; hence, they span a subspace of K^m called the
column space of A and denoted by colsp(A). Observe that colsp(A) = rowsp(A^T).

Recall that matrices A and B are row equivalent, written A ~ B, if B can be obtained from A by a
sequence of elementary row operations. Now suppose M is the matrix obtained by applying one of the
following elementary row operations on a matrix A:

(1) Interchange R_i and R_j.   (2) Replace R_i by kR_i.   (3) Replace R_i by kR_j + R_i.

Then each row of M is a row of A or a linear combination of rows of A. Hence, the row space of M is 
contained in the row space of A. On the other hand, we can apply the inverse elementary row operation on 
M to obtain A; hence, the row space of A is contained in the row space of M. Accordingly, A and M have 
the same row space. This will be true each time we apply an elementary row operation. Thus, we have 
proved the following theorem. 

THEOREM 4.6: Row equivalent matrices have the same row space. 

We are now able to prove (Problems 4.45-4.47) basic results on row equivalence (which first
appeared as Theorems 3.7 and 3.8 in Chapter 3).

THEOREM 4.7: Suppose A = [a_ij] and B = [b_ij] are row equivalent echelon matrices with respective
pivot entries

a_{1j_1}, a_{2j_2}, ..., a_{rj_r} and b_{1k_1}, b_{2k_2}, ..., b_{sk_s}

Then A and B have the same number of nonzero rows (that is, r = s) and their pivot
entries are in the same positions (that is, j_1 = k_1, j_2 = k_2, ..., j_r = k_r).

THEOREM 4.8: Suppose A and B are row canonical matrices. Then A and B have the same row space
if and only if they have the same nonzero rows.

COROLLARY 4.9: Every matrix A is row equivalent to a unique matrix in row canonical form.

We apply the above results in the next example.


EXAMPLE 4.9 Consider the following two sets of vectors in R^4:

u_1 = (1, 2, -1, 3),   u_2 = (2, 4, 1, -2),   u_3 = (3, 6, 3, -7)

w_1 = (1, 2, -4, 11),   w_2 = (2, 4, -5, 14)

Let U = span(u_i) and W = span(w_i). There are two ways to show that U = W.

(a) Show that each u_i is a linear combination of w_1 and w_2, and show that each w_i is a linear combination of u_1, u_2,
u_3. Observe that we have to show that six systems of linear equations are consistent.

(b) Form the matrix A whose rows are u_1, u_2, u_3 and row reduce A to row canonical form, and form the matrix B
whose rows are w_1 and w_2 and row reduce B to row canonical form:



    [1  2  -1   3]     [1  2  -1    3]     [1  2  0   1/3]
A = [2  4   1  -2]  ~  [0  0   3   -8]  ~  [0  0  1  -8/3]
    [3  6   3  -7]     [0  0   6  -16]     [0  0  0    0 ]

B = [1  2  -4  11]  ~  [1  2  -4  11]  ~  [1  2  0   1/3]
    [2  4  -5  14]     [0  0   3  -8]     [0  0  1  -8/3]

Because the nonzero rows of the matrices in row canonical form are identical, the row spaces of A and B are 
equal. Therefore, U = W. 

Clearly, the method in (b) is more efficient than the method in (a). 
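
Method (b) lends itself to a short program. The following is a minimal sketch, assuming exact rational arithmetic via Python's Fraction type; rref, nonzero_rows, and same_span are hypothetical helper names, and the comparison relies on Theorem 4.8 (row canonical matrices have the same row space if and only if they have the same nonzero rows).

```python
from fractions import Fraction

def rref(rows):
    """Row canonical (reduced row echelon) form of a matrix given as a list of rows."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(M[0])):
        pivot = next((r for r in range(pivot_row, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        lead = M[pivot_row][col]
        M[pivot_row] = [x / lead for x in M[pivot_row]]     # make the pivot 1
        for r in range(len(M)):
            if r != pivot_row and M[r][col] != 0:            # clear the column
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == len(M):
            break
    return M

def nonzero_rows(M):
    """The rows of M that are not identically zero."""
    return [row for row in M if any(x != 0 for x in row)]

def same_span(S, T):
    """True when the rows of S and the rows of T span the same subspace."""
    return nonzero_rows(rref(S)) == nonzero_rows(rref(T))

U_rows = [[1, 2, -1, 3], [2, 4, 1, -2], [3, 6, 3, -7]]
W_rows = [[1, 2, -4, 11], [2, 4, -5, 14]]

canonical = nonzero_rows(rref(U_rows))   # [[1, 2, 0, 1/3], [0, 0, 1, -8/3]]
equal_spans = same_span(U_rows, W_rows)  # True
```

Only two reductions are needed here, in contrast with the several linear systems required by method (a).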


4.7 Linear Dependence and Independence 


Let V be a vector space over a field K. The following defines the notion of linear dependence and 
independence of vectors over K. (One usually suppresses mentioning K when the field is understood.) This 
concept plays an essential role in the theory of linear algebra and in mathematics in general. 

DEFINITION 4.3: We say that the vectors v_1, v_2, ..., v_m in V are linearly dependent if there exist
scalars a_1, a_2, ..., a_m in K, not all of them 0, such that

a_1 v_1 + a_2 v_2 + ... + a_m v_m = 0

Otherwise, we say that the vectors are linearly independent.

The above definition may be restated as follows. Consider the vector equation

x_1 v_1 + x_2 v_2 + ... + x_m v_m = 0    (*)

where the x's are unknown scalars. This equation always has the zero solution x_1 = 0, x_2 = 0, ..., x_m = 0.
Suppose this is the only solution; that is, suppose we can show:

x_1 v_1 + x_2 v_2 + ... + x_m v_m = 0 implies x_1 = 0, x_2 = 0, ..., x_m = 0

Then the vectors v_1, v_2, ..., v_m are linearly independent. On the other hand, suppose the equation (*) has a
nonzero solution; then the vectors are linearly dependent.

A set S = {v_1, v_2, ..., v_m} of vectors in V is linearly dependent or independent according to whether
the vectors v_1, v_2, ..., v_m are linearly dependent or independent.

An infinite set S of vectors is linearly dependent or independent according to whether there do or do not
exist vectors v_1, v_2, ..., v_k in S that are linearly dependent.

Warning: The set S = {v_1, v_2, ..., v_m} above represents a list or, in other words, a finite sequence
of vectors where the vectors are ordered and repetition is permitted.

The following remarks follow directly from the above definition.

Remark 1: Suppose 0 is one of the vectors v_1, v_2, ..., v_m, say v_1 = 0. Then the vectors must be
linearly dependent, because we have the following linear combination where the coefficient of v_1 ≠ 0:

1v_1 + 0v_2 + ... + 0v_m = 1 · 0 + 0 + ... + 0 = 0

Remark 2: Suppose v is a nonzero vector. Then v, by itself, is linearly independent, because

kv = 0, v ≠ 0 implies k = 0

Remark 3: Suppose two of the vectors v_1, v_2, ..., v_m are equal or one is a scalar multiple of the
other, say v_1 = kv_2. Then the vectors must be linearly dependent, because we have the following linear
combination where the coefficient of v_1 ≠ 0:

v_1 - kv_2 + 0v_3 + ... + 0v_m = 0

Remark 4: Two vectors v_1 and v_2 are linearly dependent if and only if one of them is a multiple of
the other.

Remark 5: If the set {v_1, ..., v_m} is linearly independent, then any rearrangement of the vectors
{v_{i_1}, v_{i_2}, ..., v_{i_m}} is also linearly independent.

Remark 6: If a set S of vectors is linearly independent, then any subset of S is linearly independent. 
Alternatively, if S contains a linearly dependent subset, then S is linearly dependent. 

EXAMPLE 4.10

(a) Let u = (1, 1, 0), v = (1, 3, 2), w = (4, 9, 5). Then u, v, w are linearly dependent, because

3u + 5v - 2w = 3(1, 1, 0) + 5(1, 3, 2) - 2(4, 9, 5) = (0, 0, 0) = 0

(b) We show that the vectors u = (1, 2, 3), v = (2, 5, 7), w = (1, 3, 5) are linearly independent. We form the vector
equation xu + yv + zw = 0, where x, y, z are unknown scalars. This yields

 x + 2y +  z = 0           x + 2y + z = 0
2x + 5y + 3z = 0     or         y + z = 0
3x + 7y + 5z = 0                   2z = 0

Back-substitution yields x = 0, y = 0, z = 0. We have shown that

xu + yv + zw = 0 implies x = 0, y = 0, z = 0

Accordingly, u, v, w are linearly independent.

(c) Let V be the vector space of functions from R into R. We show that the functions f(t) = sin t, g(t) = e^t, h(t) = t^2
are linearly independent. We form the vector (function) equation xf + yg + zh = 0, where x, y, z are unknown
scalars. This function equation means that, for every value of t,

x sin t + ye^t + zt^2 = 0

Thus, in this equation, we choose appropriate values of t to easily get x = 0, y = 0, z = 0. For example,

(i) Substitute t = 0 to obtain x(0) + y(1) + z(0) = 0 or y = 0

(ii) Substitute t = π to obtain x(0) + 0(e^π) + z(π^2) = 0 or z = 0

(iii) Substitute t = π/2 to obtain x(1) + 0(e^{π/2}) + 0(π^2/4) = 0 or x = 0

We have shown

xf + yg + zh = 0 implies x = 0, y = 0, z = 0

Accordingly, f, g, h are linearly independent.
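
Parts (a) and (b) of Example 4.10 can be checked by machine. The sketch below is one possible test, stated under assumptions: it places the vectors as the rows of a matrix, reduces to echelon form with exact rational arithmetic, and declares the vectors independent when no zero row appears. This uses the later results that row equivalent matrices have the same row space and that the nonzero rows of an echelon matrix are linearly independent. The names count_pivots and is_independent are hypothetical.

```python
from fractions import Fraction

def count_pivots(vectors):
    """Reduce the vectors (as rows) to echelon form and count the nonzero rows."""
    M = [[Fraction(x) for x in vec] for vec in vectors]
    pivots = 0
    for col in range(len(M[0])):
        pivot = next((r for r in range(pivots, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivots], M[pivot] = M[pivot], M[pivots]
        for r in range(pivots + 1, len(M)):
            factor = M[r][col] / M[pivots][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[pivots])]
        pivots += 1
    return pivots

def is_independent(vectors):
    """True when no vector is lost (reduced to zero) during row reduction."""
    return count_pivots(vectors) == len(vectors)

# Part (a): dependent vectors and the relation 3u + 5v - 2w = 0.
ua, va, wa = (1, 1, 0), (1, 3, 2), (4, 9, 5)
relation = tuple(3 * a + 5 * b - 2 * c for a, b, c in zip(ua, va, wa))

# Part (b): independent vectors.
ub, vb, wb = (1, 2, 3), (2, 5, 7), (1, 3, 5)

dependent_a = not is_independent([ua, va, wa])   # True
independent_b = is_independent([ub, vb, wb])     # True
```

Part (c) concerns functions rather than tuples, so this row-reduction test does not apply to it directly; the substitution argument in the text is the appropriate tool there.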



Linear Dependence in R^3

Linear dependence in the vector space V = R^3 can be described geometrically as follows:

(a) Any two vectors u and v in R^3 are linearly dependent if and only if they lie on the same line through
the origin O, as shown in Fig. 4-3(a).

(b) Any three vectors u, v, w in R^3 are linearly dependent if and only if they lie on the same plane
through the origin O, as shown in Fig. 4-3(b).

Later, we will be able to show that any four or more vectors in R^3 are automatically linearly dependent.




(b) u, v, and w are linearly dependent 


Figure 4-3 


Linear Dependence and Linear Combinations 

The notions of linear dependence and linear combinations are closely related. Specifically, for more than
one vector, we show that the vectors v_1, v_2, ..., v_m are linearly dependent if and only if one of them is a
linear combination of the others.

Suppose, say, v_i is a linear combination of the others,

v_i = a_1 v_1 + ... + a_{i-1} v_{i-1} + a_{i+1} v_{i+1} + ... + a_m v_m

Then by adding -v_i to both sides, we obtain

a_1 v_1 + ... + a_{i-1} v_{i-1} - v_i + a_{i+1} v_{i+1} + ... + a_m v_m = 0

where the coefficient of v_i is not 0. Hence, the vectors are linearly dependent. Conversely, suppose the
vectors are linearly dependent, say,

b_1 v_1 + ... + b_j v_j + ... + b_m v_m = 0, where b_j ≠ 0

Then we can solve for v_j obtaining

v_j = -b_j^-1 b_1 v_1 - ... - b_j^-1 b_{j-1} v_{j-1} - b_j^-1 b_{j+1} v_{j+1} - ... - b_j^-1 b_m v_m

and so v_j is a linear combination of the other vectors.
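
The step of solving a dependence relation for one of its vectors is purely mechanical, as the following sketch suggests. It is an illustration under stated assumptions: solve_for is a hypothetical helper, indices are 0-based, and the sample relation 3u + 5v - 2w = 0 is the one from Example 4.10(a).

```python
from fractions import Fraction

def solve_for(j, coeffs):
    """Given b_1 v_1 + ... + b_m v_m = 0 with b_j != 0, return the coefficients
    that express v_j in terms of the other vectors, keyed by 0-based index."""
    bj = Fraction(coeffs[j])
    if bj == 0:
        raise ValueError("the chosen coefficient must be nonzero")
    return {i: -Fraction(b) / bj for i, b in enumerate(coeffs) if i != j}

u, v, w = (1, 1, 0), (1, 3, 2), (4, 9, 5)

# Solve 3u + 5v - 2w = 0 for w (index 2): w = (3/2)u + (5/2)v.
expr = solve_for(2, [3, 5, -2])
rebuilt = tuple(expr[0] * a + expr[1] * b for a, b in zip(u, v))
```

The tuple rebuilt equals w, so w is indeed a linear combination of u and v. Choosing an index whose coefficient is 0 raises an error, which reflects the requirement b_j ≠ 0.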

We now state a slightly stronger statement than the one above. This result has many important 
consequences. 


LEMMA 4.10: Suppose two or more nonzero vectors v_1, v_2, ..., v_m are linearly dependent. Then one
of the vectors is a linear combination of the preceding vectors; that is, there exists k > 1
such that

v_k = c_1 v_1 + c_2 v_2 + ... + c_{k-1} v_{k-1}

Linear Dependence and Echelon Matrices

Consider the following echelon matrix A, whose pivots have been circled: 


    [0  ©  3  4  5  6  7]
    [0  0  @  3  2  3  4]
A = [0  0  0  0  ©  8  9]
    [0  0  0  0  0  ©  7]
    [0  0  0  0  0  0  0]


Observe that the rows R_2, R_3, R_4 have 0's in the second column below the nonzero pivot in R_1, and hence
any linear combination of R_2, R_3, R_4 must have 0 as its second entry. Thus, R_1 cannot be a linear
combination of the rows below it. Similarly, the rows R_3 and R_4 have 0's in the third column below the
nonzero pivot in R_2, and hence R_2 cannot be a linear combination of the rows below it. Finally, R_3 cannot
be a multiple of R_4, because R_4 has a 0 in the fifth column below the nonzero pivot in R_3. Viewing the
nonzero rows from the bottom up, R_4, R_3, R_2, R_1, no row is a linear combination of the preceding rows.
Thus, the rows are linearly independent by Lemma 4.10.

The argument used with the above echelon matrix A can be used for the nonzero rows of any echelon 
matrix. Thus, we have the following very useful result. 

THEOREM 4.11: The nonzero rows of a matrix in echelon form are linearly independent. 


4.8 Basis and Dimension 


First we state two equivalent ways to define a basis of a vector space V. (The equivalence is proved in 
Problem 4.28.) 

DEFINITION 4.4 A: A set S = {u_1, u_2, ..., u_n} of vectors is a basis of V if it has the following two
properties: (1) S is linearly independent. (2) S spans V.

DEFINITION 4.4 B: A set S = {u_1, u_2, ..., u_n} of vectors is a basis of V if every v ∈ V can be written
uniquely as a linear combination of the basis vectors.

The following is a fundamental result in linear algebra. 

THEOREM 4.12: Let V be a vector space such that one basis has m elements and another basis has n 
elements. Then m = n. 

A vector space $V$ is said to be of finite dimension $n$ or $n$-dimensional, written

\[ \dim V = n \]

if $V$ has a basis with $n$ elements. Theorem 4.12 tells us that all bases of $V$ have the same number of
elements, so the dimension is well defined.

The vector space {0} is defined to have dimension 0. 

Suppose a vector space V does not have a finite basis. Then V is said to be of infinite dimension or to be 
infinite-dimensional. 

The above fundamental Theorem 4.12 is a consequence of the following "replacement lemma" (proved
in Problem 4.35).

LEMMA 4.13: Suppose $\{v_1, v_2, \ldots, v_n\}$ spans $V$, and suppose $\{w_1, w_2, \ldots, w_m\}$ is linearly
independent. Then $m \le n$, and $V$ is spanned by a set of the form

\[ \{w_1, w_2, \ldots, w_m, v_{i_1}, v_{i_2}, \ldots, v_{i_{n-m}}\} \]

Thus, in particular, $n + 1$ or more vectors in $V$ are linearly dependent.

Observe in the above lemma that we have replaced $m$ of the vectors in the spanning set of $V$ by the $m$
independent vectors and still retained a spanning set.

Examples of Bases

This subsection presents important examples of bases of some of the main vector spaces appearing in this 
text. 

(a) Vector space $K^n$: Consider the following $n$ vectors in $K^n$:

\[ e_1 = (1,0,0,0,\ldots,0,0), \quad e_2 = (0,1,0,0,\ldots,0,0), \quad \ldots, \quad e_n = (0,0,0,0,\ldots,0,1) \]

These vectors are linearly independent. (For example, they form a matrix in echelon form.)
Furthermore, any vector $v = (a_1, a_2, \ldots, a_n)$ in $K^n$ can be written as a linear combination of the
above vectors. Specifically,

\[ v = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n \]

Accordingly, the vectors form a basis of $K^n$ called the usual or standard basis of $K^n$. Thus (as one
might expect), $K^n$ has dimension $n$. In particular, any other basis of $K^n$ has $n$ elements.

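(b) Vector space $M = M_{r,s}$ of all $r \times s$ matrices: The following six matrices form a basis of the
vector space $M_{2,3}$ of all $2 \times 3$ matrices over $K$:

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \;
\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \;
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \;
\begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \;
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \;
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

More generally, in the vector space $M = M_{r,s}$ of all $r \times s$ matrices, let $E_{ij}$ be the matrix with $ij$-entry 1
and 0's elsewhere. Then all such matrices form a basis of $M_{r,s}$ called the usual or standard basis of
$M_{r,s}$. Accordingly, $\dim M_{r,s} = rs$.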

(c) Vector space $P_n(t)$ of all polynomials of degree $\le n$: The set $S = \{1, t, t^2, t^3, \ldots, t^n\}$ of $n + 1$
polynomials is a basis of $P_n(t)$. Specifically, any polynomial $f(t)$ of degree $\le n$ can be expressed as a
linear combination of these powers of $t$, and one can show that these polynomials are linearly
independent. Therefore, $\dim P_n(t) = n + 1$.

(d) Vector space $P(t)$ of all polynomials: Consider any finite set $S = \{f_1(t), f_2(t), \ldots, f_m(t)\}$ of
polynomials in $P(t)$, and let $d$ denote the largest of the degrees of the polynomials. Then any
polynomial $g(t)$ of degree exceeding $d$ cannot be expressed as a linear combination of the elements of
$S$. Thus, $S$ cannot be a basis of $P(t)$. This means that the dimension of $P(t)$ is infinite. We note that the
infinite set $S' = \{1, t, t^2, t^3, \ldots\}$, consisting of all the powers of $t$, spans $P(t)$ and is linearly
independent. Accordingly, $S'$ is an infinite basis of $P(t)$.

Theorems on Bases 

The following three theorems (proved in Problems 4.37, 4.38, and 4.39) will be used frequently. 

THEOREM 4.14: Let V be a vector space of finite dimension n. Then: 

(i) Any n + 1 or more vectors in V are linearly dependent. 

(ii) Any linearly independent set $S = \{u_1, u_2, \ldots, u_n\}$ with $n$ elements is a basis
of $V$.

(iii) Any spanning set $T = \{v_1, v_2, \ldots, v_n\}$ of $V$ with $n$ elements is a basis of $V$.

THEOREM 4.15: Suppose S spans a vector space V. Then: 

(i) Any maximum number of linearly independent vectors in S form a basis of V. 

(ii) Suppose one deletes from $S$ every vector that is a linear combination of
preceding vectors in $S$. Then the remaining vectors form a basis of $V$.

THEOREM 4.16: Let $V$ be a vector space of finite dimension and let $S = \{u_1, u_2, \ldots, u_r\}$ be a set of
linearly independent vectors in $V$. Then $S$ is part of a basis of $V$; that is, $S$ may be
extended to a basis of $V$.

EXAMPLE 4.11 

(a) The following four vectors in $\mathbf{R}^4$ form a matrix in echelon form:

(1,1,1,1), (0,1,1,1), (0,0,1,1), (0,0,0,1)

Thus, the vectors are linearly independent, and, because $\dim \mathbf{R}^4 = 4$, the four vectors form a basis of $\mathbf{R}^4$.

(b) The following $n + 1$ polynomials in $P_n(t)$ are of increasing degree:

\[ 1, \quad t - 1, \quad (t - 1)^2, \quad \ldots, \quad (t - 1)^n \]

Therefore, no polynomial is a linear combination of preceding polynomials; hence, the polynomials are linearly
independent. Furthermore, they form a basis of $P_n(t)$, because $\dim P_n(t) = n + 1$.

(c) Consider any four vectors in $\mathbf{R}^3$, say

(257,-132,58), (43,0,-17), (521,-317,94), (328,-512,-731)

By Theorem 4.14(i), the four vectors must be linearly dependent, because they come from the three-dimensional
vector space $\mathbf{R}^3$.
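
Example 4.11(c) can also be checked by computation. The following Python sketch row reduces the matrix whose rows are the four vectors and counts the pivots; the helper name `rank` is an illustrative choice (it does not come from the text), and exact rational arithmetic from the standard `fractions` module is assumed so that no rounding occurs.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        # look for a row at or below position r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

vectors = [(257, -132, 58), (43, 0, -17), (521, -317, 94), (328, -512, -731)]
r = rank(vectors)
# The rank cannot exceed 3 (the dimension of R^3), so four vectors are dependent.
print(r, r < len(vectors))
```

A rank smaller than the number of vectors is exactly the statement that the vectors are linearly dependent.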

Dimension and Subspaces 

The following theorem (proved in Problem 4.40) gives the basic relationship between the dimension of a 
vector space and the dimension of a subspace. 

THEOREM 4.17: Let $W$ be a subspace of an $n$-dimensional vector space $V$. Then $\dim W \le n$. In
particular, if $\dim W = n$, then $W = V$.

EXAMPLE 4.12 Let $W$ be a subspace of the real space $\mathbf{R}^3$. Note that $\dim \mathbf{R}^3 = 3$. Theorem 4.17 tells us that the
dimension of $W$ can only be 0, 1, 2, or 3. The following cases apply:

(a) If $\dim W = 0$, then $W = \{0\}$, a point.

(b) If $\dim W = 1$, then $W$ is a line through the origin 0.

(c) If $\dim W = 2$, then $W$ is a plane through the origin 0.

(d) If $\dim W = 3$, then $W$ is the entire space $\mathbf{R}^3$.

4.9 Application to Matrices, Rank of a Matrix 


Let $A$ be any $m \times n$ matrix over a field $K$. Recall that the rows of $A$ may be viewed as vectors in $K^n$ and
that the row space of $A$, written $\operatorname{rowsp}(A)$, is the subspace of $K^n$ spanned by the rows of $A$. The following
definition applies.

DEFINITION 4.5: The rank of a matrix $A$, written $\operatorname{rank}(A)$, is equal to the maximum number of linearly
independent rows of $A$ or, equivalently, the dimension of the row space of $A$.

Recall, on the other hand, that the columns of an $m \times n$ matrix $A$ may be viewed as vectors in $K^m$ and
that the column space of $A$, written $\operatorname{colsp}(A)$, is the subspace of $K^m$ spanned by the columns of $A$. Although
$m$ may not be equal to $n$ (that is, the rows and columns of $A$ may belong to different vector spaces), we
have the following fundamental result.

THEOREM 4.18: The maximum number of linearly independent rows of any matrix A is equal to the 
maximum number of linearly independent columns of A. Thus, the dimension of the 
row space of A is equal to the dimension of the column space of A. 

Accordingly, one could restate the above definition of the rank of $A$ using columns instead of rows.

Basis-Finding Problems

This subsection shows how an echelon form of any matrix $A$ gives us the solution to certain problems
about $A$ itself. Specifically, let $A$ and $B$ be the following matrices, where the echelon matrix $B$ (whose
pivots are boxed) is an echelon form of $A$:

\[
A = \begin{bmatrix}
1 & 2 & 1 & 3 & 1 & 2 \\
2 & 5 & 5 & 7 & 4 & 5 \\
3 & 7 & 6 & 11 & 6 & 9 \\
1 & 5 & 10 & 8 & 9 & 9 \\
2 & 6 & 8 & 11 & 9 & 12
\end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix}
\boxed{1} & 2 & 1 & 3 & 1 & 2 \\
0 & \boxed{1} & 3 & 1 & 2 & 1 \\
0 & 0 & 0 & \boxed{1} & 1 & 2 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

We solve the following four problems about the matrix $A$, where $C_1, C_2, \ldots, C_6$ denote its columns:

(a) Find a basis of the row space of $A$.

(b) Find each column $C_k$ of $A$ that is a linear combination of preceding columns of $A$.

(c) Find a basis of the column space of $A$.

(d) Find the rank of $A$.

(a) We are given that $A$ and $B$ are row equivalent, so they have the same row space. Moreover, $B$ is in
echelon form, so its nonzero rows are linearly independent and hence form a basis of the row space of
$B$. Thus, they also form a basis of the row space of $A$. That is,

basis of $\operatorname{rowsp}(A)$: (1,2,1,3,1,2), (0,1,3,1,2,1), (0,0,0,1,1,2)

(b) Let $M_k = [C_1, C_2, \ldots, C_k]$, the submatrix of $A$ consisting of the first $k$ columns of $A$. Then $M_{k-1}$ and
$M_k$ are, respectively, the coefficient matrix and augmented matrix of the vector equation

\[ x_1 C_1 + x_2 C_2 + \cdots + x_{k-1} C_{k-1} = C_k \]

Theorem 3.9 tells us that the system has a solution, or, equivalently, $C_k$ is a linear combination of the
preceding columns of $A$ if and only if $\operatorname{rank}(M_k) = \operatorname{rank}(M_{k-1})$, where $\operatorname{rank}(M_k)$ means the number of
pivots in an echelon form of $M_k$. Now the first $k$ columns of the echelon matrix $B$ also form an echelon
form of $M_k$. Accordingly,

\[ \operatorname{rank}(M_2) = \operatorname{rank}(M_3) = 2 \quad \text{and} \quad \operatorname{rank}(M_4) = \operatorname{rank}(M_5) = \operatorname{rank}(M_6) = 3 \]

Thus, $C_3$, $C_5$, $C_6$ are each a linear combination of the preceding columns of $A$.

(c) The fact that the remaining columns $C_1$, $C_2$, $C_4$ are not linear combinations of their respective
preceding columns also tells us that they are linearly independent. Thus, they form a basis of the
column space of $A$. That is,

basis of $\operatorname{colsp}(A)$: $[1,2,3,1,2]^T$, $[2,5,7,5,6]^T$, $[3,7,11,8,11]^T$

Observe that $C_1$, $C_2$, $C_4$ may also be characterized as those columns of $A$ that contain the pivots in
any echelon form of $A$.

(d) Here we see that three possible definitions of the rank of $A$ yield the same value.

(i) There are three pivots in $B$, which is an echelon form of $A$.

(ii) The three pivots in $B$ correspond to the nonzero rows of $B$, which form a basis of the row space
of $A$.

(iii) The three pivots in $B$ correspond to the columns of $A$, which form a basis of the column space
of $A$.


Thus, $\operatorname{rank}(A) = 3$.

Application to Finding a Basis for $W = \operatorname{span}(u_1, u_2, \ldots, u_r)$

Frequently, we are given a list $S = \{u_1, u_2, \ldots, u_r\}$ of vectors in $K^n$ and we want to find a basis for the
subspace $W$ of $K^n$ spanned by the given vectors, that is, a basis of

\[ W = \operatorname{span}(S) = \operatorname{span}(u_1, u_2, \ldots, u_r) \]

The following two algorithms, which are essentially described in the above subsection, find such a basis
(and hence the dimension) of $W$.

Algorithm 4.1 (Row space algorithm) 

Step 1. Form the matrix M whose rows are the given vectors. 

Step 2. Row reduce M to echelon form. 

Step 3. Output the nonzero rows of the echelon matrix. 

Sometimes we want to find a basis that only comes from the original given vectors. The next algorithm 
accomplishes this task. 

Algorithm 4.2 (Casting-out algorithm) 

Step 1. Form the matrix M whose columns are the given vectors. 

Step 2. Row reduce M to echelon form. 

Step 3. For each column $C_k$ in the echelon matrix without a pivot, delete (cast out) the vector $u_k$ from
the list $S$ of given vectors.

Step 4. Output the remaining vectors in S (which correspond to columns with pivots). 

We emphasize that in the first algorithm we form a matrix whose rows are the given vectors, whereas in
the second algorithm we form a matrix whose columns are the given vectors.

EXAMPLE 4.13 Let $W$ be the subspace of $\mathbf{R}^5$ spanned by the following vectors:

\[ u_1 = (1,2,1,3,2), \quad u_2 = (1,3,3,5,3), \quad u_3 = (3,8,7,13,8) \]
\[ u_4 = (1,4,6,9,7), \quad u_5 = (5,13,13,25,19) \]

Find a basis of $W$ consisting of the original given vectors, and find $\dim W$.

Form the matrix $M$ whose columns are the given vectors, and reduce $M$ to echelon form:

\[
M = \begin{bmatrix}
1 & 1 & 3 & 1 & 5 \\
2 & 3 & 8 & 4 & 13 \\
1 & 3 & 7 & 6 & 13 \\
3 & 5 & 13 & 9 & 25 \\
2 & 3 & 8 & 7 & 19
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 1 & 3 & 1 & 5 \\
0 & 1 & 2 & 2 & 3 \\
0 & 0 & 0 & 1 & 2 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

The pivots in the echelon matrix appear in columns $C_1$, $C_2$, $C_4$. Accordingly, we "cast out" the vectors $u_3$ and $u_5$ from
the original five vectors. The remaining vectors $u_1$, $u_2$, $u_4$, which correspond to the columns in the echelon matrix
with pivots, form a basis of $W$. Thus, in particular, $\dim W = 3$.
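
Algorithms 4.1 and 4.2 can be sketched in Python as follows, using the five vectors of Example 4.13 as data. This is only one possible rendering: the function names `echelon`, `row_space_basis`, and `casting_out` are illustrative choices rather than names from the text, and exact rational arithmetic (`fractions.Fraction`) is assumed.

```python
from fractions import Fraction

def echelon(rows):
    """Row reduce to an echelon form; return (echelon matrix, pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # find a row at or below position r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def row_space_basis(vectors):
    """Algorithm 4.1: vectors as rows; output the nonzero rows of an echelon form."""
    m, _ = echelon(vectors)
    return [row for row in m if any(x != 0 for x in row)]

def casting_out(vectors):
    """Algorithm 4.2: vectors as columns; keep the vectors whose column has a pivot."""
    matrix = [list(row) for row in zip(*vectors)]  # given vectors become columns
    _, pivots = echelon(matrix)
    return [vectors[k] for k in pivots]

u = [(1, 2, 1, 3, 2), (1, 3, 3, 5, 3), (3, 8, 7, 13, 8),
     (1, 4, 6, 9, 7), (5, 13, 13, 25, 19)]
print(casting_out(u))            # the vectors u1, u2, u4
print(len(row_space_basis(u)))   # dim W
```

The row space algorithm generally returns new vectors, whereas casting out always returns a subset of the given ones; both bases have the same number of elements, namely $\dim W$.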

Remark: The justification of the casting-out algorithm is essentially described above, but we repeat
it here for emphasis. The fact that column $C_3$ in the echelon matrix in Example 4.13 does not have a
pivot means that the vector equation

\[ x u_1 + y u_2 = u_3 \]

has a solution, and hence $u_3$ is a linear combination of $u_1$ and $u_2$. Similarly, the fact that $C_5$ does not have a
pivot means that $u_5$ is a linear combination of the preceding vectors. We have deleted each vector in the
original spanning set that is a linear combination of preceding vectors. Thus, the remaining vectors are
linearly independent and form a basis of $W$.

Application to Homogeneous Systems of Linear Equations

Consider again a homogeneous system $AX = 0$ of linear equations over $K$ with $n$ unknowns. By
Theorem 4.4, the solution set $W$ of such a system is a subspace of $K^n$, and hence $W$ has a dimension.
The following theorem, whose proof is postponed until Chapter 5, holds.

THEOREM 4.19: The dimension of the solution space $W$ of a homogeneous system $AX = 0$ is $n - r$,
where $n$ is the number of unknowns and $r$ is the rank of the coefficient matrix $A$.

In the case where the system $AX = 0$ is in echelon form, it has precisely $n - r$ free variables, say
$x_{i_1}, x_{i_2}, \ldots, x_{i_{n-r}}$. Let $v_j$ be the solution obtained by setting $x_{i_j} = 1$ (or any nonzero constant) and the
remaining free variables equal to 0. We show (Problem 4.50) that the solutions $v_1, v_2, \ldots, v_{n-r}$ are linearly
independent; hence, they form a basis of the solution space $W$.

We have already used the above process to find a basis of the solution space $W$ of a homogeneous
system $AX = 0$ in Section 3.11. Problem 4.48 gives three other examples.
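
The construction just described (one basis solution per free variable) can be sketched in Python. The coefficient matrix below is a hypothetical example chosen for illustration, with $n = 5$ unknowns and rank $r = 3$; the names `rref` and `null_space_basis` are likewise assumptions, not notation from the text.

```python
from fractions import Fraction

def rref(rows):
    """Row canonical form; return (matrix, pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]          # make the pivot equal to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:         # clear the rest of the column
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(A):
    """One solution of AX = 0 per free variable, as in the text."""
    m, pivots = rref(A)
    n = len(A[0])
    free = [j for j in range(n) if j not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)                 # this free variable is 1, the others 0
        for row, p in zip(m, pivots):      # each pivot variable is then determined
            v[p] = -row[f]
        basis.append(v)
    return basis

A = [[1, 1, 3, 1, 5],
     [0, 1, 2, 2, 3],
     [0, 0, 0, 1, 2]]   # hypothetical system: n = 5, rank 3
basis = null_space_basis(A)
print(len(basis))        # n - r
```

Here the free variables are the third and fifth unknowns, so the sketch returns two solutions, in agreement with Theorem 4.19.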


4.10 Sums and Direct Sums 


Let $U$ and $W$ be subsets of a vector space $V$. The sum of $U$ and $W$, written $U + W$, consists of all sums
$u + w$ where $u \in U$ and $w \in W$. That is,

\[ U + W = \{v : v = u + w, \text{ where } u \in U \text{ and } w \in W\} \]

Now suppose $U$ and $W$ are subspaces of $V$. Then one can easily show (Problem 4.53) that $U + W$ is a
subspace of $V$. Recall that $U \cap W$ is also a subspace of $V$. The following theorem (proved in Problem 4.58)
relates the dimensions of these subspaces.


THEOREM 4.20: Suppose $U$ and $W$ are finite-dimensional subspaces of a vector space $V$. Then $U + W$
has finite dimension and

\[ \dim(U + W) = \dim U + \dim W - \dim(U \cap W) \]

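EXAMPLE 4.14 Let $V = M_{2,2}$, the vector space of $2 \times 2$ matrices. Let $U$ consist of those matrices whose second
row is zero, and let $W$ consist of those matrices whose second column is zero. Then

\[
U = \left\{ \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} \right\}, \quad
W = \left\{ \begin{bmatrix} a & 0 \\ c & 0 \end{bmatrix} \right\}
\quad \text{and} \quad
U + W = \left\{ \begin{bmatrix} a & b \\ c & 0 \end{bmatrix} \right\}, \quad
U \cap W = \left\{ \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} \right\}
\]

That is, $U + W$ consists of those matrices whose lower right entry is 0, and $U \cap W$ consists of those matrices
whose second row and second column are zero. Note that $\dim U = 2$, $\dim W = 2$, $\dim(U \cap W) = 1$. Also,
$\dim(U + W) = 3$, which is expected from Theorem 4.20. That is,

\[ \dim(U + W) = \dim U + \dim W - \dim(U \cap W) = 2 + 2 - 1 = 3 \]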
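
The dimension count in Example 4.14 can be reproduced numerically. In the sketch below each $2 \times 2$ matrix with rows $(a, b)$ and $(c, d)$ is stored as the 4-tuple $(a, b, c, d)$, which is an encoding assumed here for illustration; `rank` is a hypothetical helper based on Gaussian elimination. The dimension of $U + W$ is the rank of the two spanning sets taken together, and Theorem 4.20 then gives $\dim(U \cap W)$.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# (a, b, c, d) stands for the 2 x 2 matrix with rows (a, b) and (c, d).
U = [(1, 0, 0, 0), (0, 1, 0, 0)]   # spans the matrices whose second row is zero
W = [(1, 0, 0, 0), (0, 0, 1, 0)]   # spans the matrices whose second column is zero

dim_U, dim_W = rank(U), rank(W)
dim_sum = rank(U + W)                # list concatenation: both spanning sets together
dim_cap = dim_U + dim_W - dim_sum    # Theorem 4.20 solved for dim(U ∩ W)
print(dim_U, dim_W, dim_sum, dim_cap)
```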


Direct Sums 

The vector space $V$ is said to be the direct sum of its subspaces $U$ and $W$, denoted by

\[ V = U \oplus W \]

if every $v \in V$ can be written in one and only one way as $v = u + w$ where $u \in U$ and $w \in W$.

The following theorem (proved in Problem 4.59) characterizes such a decomposition. 

THEOREM 4.21: The vector space $V$ is the direct sum of its subspaces $U$ and $W$ if and only if:
(i) $V = U + W$, (ii) $U \cap W = \{0\}$.

EXAMPLE 4.15 Consider the vector space $V = \mathbf{R}^3$.

(a) Let $U$ be the $xy$-plane and let $W$ be the $yz$-plane; that is,

\[ U = \{(a, b, 0) : a, b \in \mathbf{R}\} \quad \text{and} \quad W = \{(0, b, c) : b, c \in \mathbf{R}\} \]

Then $\mathbf{R}^3 = U + W$, because every vector in $\mathbf{R}^3$ is the sum of a vector in $U$ and a vector in $W$. However, $\mathbf{R}^3$ is not
the direct sum of $U$ and $W$, because such sums are not unique. For example,

\[ (3,5,7) = (3,1,0) + (0,4,7) \quad \text{and also} \quad (3,5,7) = (3,-4,0) + (0,9,7) \]

(b) Let $U$ be the $xy$-plane and let $W$ be the $z$-axis; that is,

\[ U = \{(a, b, 0) : a, b \in \mathbf{R}\} \quad \text{and} \quad W = \{(0, 0, c) : c \in \mathbf{R}\} \]

Now any vector $(a, b, c) \in \mathbf{R}^3$ can be written as the sum of a vector in $U$ and a vector in $W$ in one and only one
way:

\[ (a, b, c) = (a, b, 0) + (0, 0, c) \]

Accordingly, $\mathbf{R}^3$ is the direct sum of $U$ and $W$; that is, $\mathbf{R}^3 = U \oplus W$.


General Direct Sums 

The notion of a direct sum is extended to more than two subspaces in the obvious way. That is, $V$ is the direct
sum of subspaces $W_1, W_2, \ldots, W_r$, written

\[ V = W_1 \oplus W_2 \oplus \cdots \oplus W_r \]

if every vector $v \in V$ can be written in one and only one way as

\[ v = w_1 + w_2 + \cdots + w_r \]

where $w_1 \in W_1$, $w_2 \in W_2$, $\ldots$, $w_r \in W_r$.

The following theorems hold.

THEOREM 4.22: Suppose $V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$. Also, for each $k$, suppose $S_k$ is a linearly
independent subset of $W_k$. Then

(a) The union $S = \bigcup_k S_k$ is linearly independent in $V$.

(b) If each $S_k$ is a basis of $W_k$, then $\bigcup_k S_k$ is a basis of $V$.

(c) $\dim V = \dim W_1 + \dim W_2 + \cdots + \dim W_r$.

THEOREM 4.23: Suppose $V = W_1 + W_2 + \cdots + W_r$ and $\dim V = \sum_k \dim W_k$. Then

\[ V = W_1 \oplus W_2 \oplus \cdots \oplus W_r \]

4.11 Coordinates 

Let $V$ be an $n$-dimensional vector space over $K$ with basis $S = \{u_1, u_2, \ldots, u_n\}$. Then any vector $v \in V$ can
be expressed uniquely as a linear combination of the basis vectors in $S$, say

\[ v = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n \]

These $n$ scalars $a_1, a_2, \ldots, a_n$ are called the coordinates of $v$ relative to the basis $S$, and they form a vector
$[a_1, a_2, \ldots, a_n]$ in $K^n$ called the coordinate vector of $v$ relative to $S$. We denote this vector by $[v]_S$, or
simply $[v]$, when $S$ is understood. Thus,

\[ [v]_S = [a_1, a_2, \ldots, a_n] \]

For notational convenience, brackets [...], rather than parentheses (...), are used to denote the coordinate
vector.

Remark: The above $n$ scalars $a_1, a_2, \ldots, a_n$ also form the coordinate column vector
$[a_1, a_2, \ldots, a_n]^T$ of $v$ relative to $S$. The choice of the column vector rather than the row vector to
represent $v$ depends on the context in which it is used. The use of such column vectors will become clear
later in Chapter 6.

EXAMPLE 4.16 Consider the vector space $P_2(t)$ of polynomials of degree $\le 2$. The polynomials

\[ p_1 = t + 1, \quad p_2 = t - 1, \quad p_3 = (t - 1)^2 = t^2 - 2t + 1 \]

form a basis $S$ of $P_2(t)$. The coordinate vector $[v]$ of $v = 2t^2 - 5t + 9$ relative to $S$ is obtained as follows.

Set $v = x p_1 + y p_2 + z p_3$ using unknown scalars $x$, $y$, $z$, and simplify:

\[
\begin{aligned}
2t^2 - 5t + 9 &= x(t + 1) + y(t - 1) + z(t^2 - 2t + 1) \\
&= xt + x + yt - y + zt^2 - 2zt + z \\
&= zt^2 + (x + y - 2z)t + (x - y + z)
\end{aligned}
\]

Then set the coefficients of the same powers of $t$ equal to each other to obtain the system

\[ z = 2, \quad x + y - 2z = -5, \quad x - y + z = 9 \]

The solution of the system is $x = 3$, $y = -4$, $z = 2$. Thus,

\[ v = 3p_1 - 4p_2 + 2p_3, \quad \text{and hence} \quad [v] = [3, -4, 2] \]

EXAMPLE 4.17 Consider real space $\mathbf{R}^3$. The following vectors form a basis $S$ of $\mathbf{R}^3$:

\[ u_1 = (1, -1, 0), \quad u_2 = (1, 1, 0), \quad u_3 = (0, 1, 1) \]

The coordinates of $v = (5, 3, 4)$ relative to the basis $S$ are obtained as follows.

Set $v = x u_1 + y u_2 + z u_3$; that is, set $v$ as a linear combination of the basis vectors using unknown scalars $x$, $y$, $z$.
This yields

\[
\begin{bmatrix} 5 \\ 3 \\ 4 \end{bmatrix}
= x \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}
+ y \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
+ z \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
\]

The equivalent system of linear equations is as follows:

\[ x + y = 5, \quad -x + y + z = 3, \quad z = 4 \]

The solution of the system is $x = 3$, $y = 2$, $z = 4$. Thus,

\[ v = 3u_1 + 2u_2 + 4u_3, \quad \text{and so} \quad [v]_S = [3, 2, 4] \]
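
Finding a coordinate vector amounts to solving a square linear system, as Examples 4.16 and 4.17 show. The following Python sketch does this by Gauss-Jordan elimination on the augmented matrix $[u_1 \; \cdots \; u_n \mid v]$. The function name `coordinates` and the encoding of a polynomial $a + bt + ct^2$ as the triple $(a, b, c)$ are assumptions made for illustration; the routine also assumes that its first argument really is a basis.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Coordinates of v relative to a basis of K^n (n vectors of length n)."""
    n = len(basis)
    # row i of the augmented matrix: i-th entries of u_1, ..., u_n, then of v
    m = [[Fraction(u[i]) for u in basis] + [Fraction(v[i])] for i in range(n)]
    for c in range(n):
        # a nonzero entry exists in column c because the vectors form a basis
        p = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [x / piv for x in m[c]]
        for i in range(n):
            if i != c and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[c])]
    return [row[n] for row in m]   # last column holds the coordinates

# Example 4.17: basis S of R^3 and v = (5, 3, 4)
S = [(1, -1, 0), (1, 1, 0), (0, 1, 1)]
print(coordinates(S, (5, 3, 4)))

# Example 4.16: t + 1, t - 1, (t - 1)^2 stored as (constant, t, t^2) coefficients
P = [(1, 1, 0), (-1, 1, 0), (1, -2, 1)]
print(coordinates(P, (9, -5, 2)))   # v = 2t^2 - 5t + 9
```

Treating polynomials through their coefficient triples is an instance of the identification of an $n$-dimensional space with $K^n$ by means of a basis.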

Remark 1: There is a geometrical interpretation of the coordinates of a vector $v$ relative to a basis $S$
for the real space $\mathbf{R}^n$, which we illustrate using the basis $S$ of $\mathbf{R}^3$ in Example 4.17. First consider the space
$\mathbf{R}^3$ with the usual $x$, $y$, $z$ axes. Then the basis vectors determine a new coordinate system of $\mathbf{R}^3$, say with $x'$,
$y'$, $z'$ axes, as shown in Fig. 4-4. That is,

(1) The $x'$-axis is in the direction of $u_1$ with unit length $\|u_1\|$.

(2) The $y'$-axis is in the direction of $u_2$ with unit length $\|u_2\|$.

(3) The $z'$-axis is in the direction of $u_3$ with unit length $\|u_3\|$.

Then each vector $v = (a, b, c)$ or, equivalently, the point $P(a, b, c)$ in $\mathbf{R}^3$ will have new coordinates with
respect to the new $x'$, $y'$, $z'$ axes. These new coordinates are precisely $[v]_S$, the coordinates of $v$ with respect
to the basis $S$. Thus, as shown in Example 4.17, the coordinates of the point $P(5, 3, 4)$ with the new axes
form the vector $[3, 2, 4]$.


[Figure 4-4: the vector $v = (5, 3, 4) = [3, 2, 4]$ relative to the new coordinate axes]

Remark 2: Consider the usual basis $E = \{e_1, e_2, \ldots, e_n\}$ of $K^n$ defined by

\[ e_1 = (1,0,0,\ldots,0,0), \quad e_2 = (0,1,0,\ldots,0,0), \quad \ldots, \quad e_n = (0,0,0,\ldots,0,1) \]

Let $v = (a_1, a_2, \ldots, a_n)$ be any vector in $K^n$. Then one can easily show that

\[ v = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n, \quad \text{and so} \quad [v]_E = [a_1, a_2, \ldots, a_n] \]

That is, the coordinate vector $[v]_E$ of any vector $v$ relative to the usual basis $E$ of $K^n$ is identical to the
original vector $v$.


4.12 Isomorphism of V and K n 

Let $V$ be a vector space of dimension $n$ over $K$, and suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$. Then each
vector $v \in V$ corresponds to a unique $n$-tuple $[v]_S$ in $K^n$. On the other hand, each $n$-tuple $[c_1, c_2, \ldots, c_n]$ in
$K^n$ corresponds to a unique vector $c_1 u_1 + c_2 u_2 + \cdots + c_n u_n$ in $V$. Thus, the basis $S$ induces a one-to-one
correspondence between $V$ and $K^n$. Furthermore, suppose

\[ v = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n \quad \text{and} \quad w = b_1 u_1 + b_2 u_2 + \cdots + b_n u_n \]

Then

\[
\begin{aligned}
v + w &= (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + \cdots + (a_n + b_n)u_n \\
kv &= (ka_1)u_1 + (ka_2)u_2 + \cdots + (ka_n)u_n
\end{aligned}
\]

where $k$ is a scalar. Accordingly,

\[
\begin{aligned}
[v + w]_S &= [a_1 + b_1, \ldots, a_n + b_n] = [a_1, \ldots, a_n] + [b_1, \ldots, b_n] = [v]_S + [w]_S \\
[kv]_S &= [ka_1, ka_2, \ldots, ka_n] = k[a_1, a_2, \ldots, a_n] = k[v]_S
\end{aligned}
\]

Thus, the above one-to-one correspondence between $V$ and $K^n$ preserves the vector space operations of
vector addition and scalar multiplication. We then say that $V$ and $K^n$ are isomorphic, written

\[ V \cong K^n \]

We state this result formally.

THEOREM 4.24: Let $V$ be an $n$-dimensional vector space over a field $K$. Then $V$ and $K^n$ are
isomorphic.

The next example gives a practical application of the above result. 


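EXAMPLE 4.18 Suppose we want to determine whether or not the following matrices in $V = M_{2,3}$ are linearly
dependent:

\[
A = \begin{bmatrix} 1 & 2 & -3 \\ 4 & 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 3 & -4 \\ 6 & 5 & 4 \end{bmatrix}, \quad
C = \begin{bmatrix} 3 & 8 & -11 \\ 16 & 10 & 9 \end{bmatrix}
\]

The coordinate vectors of the matrices in the usual basis of $M_{2,3}$ are as follows:

\[ [A] = [1, 2, -3, 4, 0, 1], \quad [B] = [1, 3, -4, 6, 5, 4], \quad [C] = [3, 8, -11, 16, 10, 9] \]

Form the matrix $M$ whose rows are the above coordinate vectors and reduce $M$ to an echelon form:

\[
M = \begin{bmatrix}
1 & 2 & -3 & 4 & 0 & 1 \\
1 & 3 & -4 & 6 & 5 & 4 \\
3 & 8 & -11 & 16 & 10 & 9
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 2 & -3 & 4 & 0 & 1 \\
0 & 1 & -1 & 2 & 5 & 3 \\
0 & 2 & -2 & 4 & 10 & 6
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 2 & -3 & 4 & 0 & 1 \\
0 & 1 & -1 & 2 & 5 & 3 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

Because the echelon matrix has only two nonzero rows, the coordinate vectors $[A]$, $[B]$, $[C]$ span a subspace of
dimension 2 and so are linearly dependent. Accordingly, the original matrices $A$, $B$, $C$ are linearly dependent.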
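
The procedure of Example 4.18 (pass to coordinate vectors, then row reduce) can be sketched in Python. The helper names `coords` and `rank` are illustrative assumptions; `coords` simply lists the entries of a matrix row by row, which is its coordinate vector in the usual basis.

```python
from fractions import Fraction

def coords(matrix):
    """Coordinate vector in the usual basis: the entries listed row by row."""
    return [x for row in matrix for x in row]

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

A = [[1, 2, -3], [4, 0, 1]]
B = [[1, 3, -4], [6, 5, 4]]
C = [[3, 8, -11], [16, 10, 9]]

M = [coords(A), coords(B), coords(C)]
# The matrices are dependent exactly when the rank is below the number of matrices.
print(rank(M), rank(M) < len(M))
```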


4.13 Full Rank Factorization 


A matrix $B$ is said to have full row rank $r$ if $B$ has exactly $r$ rows and they are linearly independent, and a
matrix $C$ is said to have full column rank $r$ if $C$ has exactly $r$ columns and they are linearly independent.

DEFINITION 4.6: Let $A$ be an $m \times n$ matrix of rank $r$. Then $A$ is said to have the full rank factorization

\[ A = BC \]

where $B$ has full column rank $r$ and $C$ has full row rank $r$.

THEOREM 4.25: Every matrix A with rank r > 0 has a full rank factorization. 

There are many full rank factorizations of a matrix A. Fig. 4-5 gives an algorithm to find one such 
factorization. 


Algorithm 4-1: The input is a matrix $A$ of rank $r > 0$. The output is a full rank factorization of $A$.

Step 1. Find the row canonical form $M$ of $A$.

Step 2. Let $B$ be the matrix whose columns are the columns of $A$ corresponding to the columns of $M$
with pivots.

Step 3. Let $C$ be the matrix whose rows are the nonzero rows of $M$.

Then $A = BC$ is a full rank factorization of $A$.


Figure 4-5 
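
The algorithm of Fig. 4-5 can be sketched in Python as follows. The function names `row_canonical`, `full_rank_factorization`, and `matmul` are illustrative assumptions, exact rational arithmetic is used, and the test matrix is the one treated in Example 4.19.

```python
from fractions import Fraction

def row_canonical(rows):
    """Step 1: row canonical form; return (matrix, pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def full_rank_factorization(A):
    M, pivots = row_canonical(A)
    B = [[row[j] for j in pivots] for row in A]          # Step 2: pivot columns of A
    C = [row for row in M if any(x != 0 for x in row)]   # Step 3: nonzero rows of M
    return B, C

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 1, -1, 2], [2, 2, -1, 3], [-1, -1, 2, -3]]
B, C = full_rank_factorization(A)
print(matmul(B, C) == A)   # the product BC reproduces A
```

Taking $B$ from the columns of $A$ itself, rather than from $M$, is what makes the product $BC$ equal to $A$: each column of $A$ is the combination of the pivot columns whose coefficients appear in the corresponding column of $M$.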


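EXAMPLE 4.19 Let

\[
A = \begin{bmatrix}
1 & 1 & -1 & 2 \\
2 & 2 & -1 & 3 \\
-1 & -1 & 2 & -3
\end{bmatrix}
\quad \text{where} \quad
M = \begin{bmatrix}
1 & 1 & 0 & 1 \\
0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]

is the row canonical form of $A$. We set

\[
B = \begin{bmatrix}
1 & -1 \\
2 & -1 \\
-1 & 2
\end{bmatrix}
\quad \text{and} \quad
C = \begin{bmatrix}
1 & 1 & 0 & 1 \\
0 & 0 & 1 & -1
\end{bmatrix}
\]

Then $A = BC$ is a full rank factorization of $A$.

4.14 Generalized (Moore-Penrose) Inverse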

Here we assume that the field of scalars is the complex field $\mathbf{C}$, where the matrix $A^H$ is the conjugate
transpose of a matrix $A$. [If $A$ is a real matrix, then $A^H = A^T$.]

DEFINITION 4.7: Let $A$ be an $m \times n$ matrix over $\mathbf{C}$. A matrix, denoted by $A^+$, is called the
pseudoinverse or Moore-Penrose inverse or MP-inverse of $A$ if $X = A^+$ satisfies the
following four equations:

[MP1] $AXA = A$, [MP3] $(AX)^H = AX$,

[MP2] $XAX = X$, [MP4] $(XA)^H = XA$.

Clearly, $A^+$ is an $n \times m$ matrix. Also, $A^+ = A^{-1}$ if $A$ is nonsingular.

LEMMA 4.26: A + is unique (when it exists). 

Proof. Suppose $X$ and $Y$ satisfy the four MP equations. Then

\[ AY = (AY)^H = (AXAY)^H = (AY)^H (AX)^H = AYAX = (AYA)X = AX \]

The first and fourth equations use [MP3], and the second and last equations use [MP1]. Similarly,
$YA = XA$ (which uses [MP4] and [MP1]). Then,

\[ Y = YAY = (YA)Y = (XA)Y = X(AY) = X(AX) = X \]

where the first equation uses [MP2].

LEMMA 4.27: A + exists for any matrix A. 

Fig. 4-6 gives an algorithm that finds an MP-inverse for any matrix A. 

Algorithm 4-2: Input is an $m \times n$ matrix $A$ over $\mathbf{C}$ of rank $r$. Output is $A^+$.

Step 1. Interchange rows and columns of $A$ so that

\[ PAQ = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \]

where $A_{11}$ is a nonsingular $r \times r$ block. [Here $P$ and $Q$ are the products of elementary matrices
corresponding to the interchanges of the rows and columns.]

Step 2. Set $B = \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix}$ and $C = [I_r, \; A_{11}^{-1} A_{12}]$, where $I_r$ is the $r \times r$ identity matrix.

Step 3. Set $A^+ = Q[C^H (CC^H)^{-1} (B^H B)^{-1} B^H]P$.

Figure 4-6

Combining the above two lemmas we obtain: 

THEOREM 4.28: Every matrix $A$ over $\mathbf{C}$ has a unique Moore-Penrose inverse $A^+$.

There are special cases when A has full-row rank or full-column rank. 

THEOREM 4.29: Let $A$ be a matrix over $\mathbf{C}$.

(a) If $A$ has full column rank (columns are linearly independent), then
$A^+ = (A^H A)^{-1} A^H$.

(b) If $A$ has full row rank (rows are linearly independent), then $A^+ = A^H (AA^H)^{-1}$.

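THEOREM 4.30: Let $A$ be a matrix over $\mathbf{C}$. Suppose $A = BC$ is a full rank factorization of $A$. Then

\[ A^+ = C^+ B^+ = C^H (CC^H)^{-1} (B^H B)^{-1} B^H \]

Moreover, $AA^+ = BB^+$ and $A^+A = C^+C$.

EXAMPLE 4.20 Consider the full rank factorization $A = BC$ in Example 4.19; that is,

\[
A = \begin{bmatrix}
1 & 1 & -1 & 2 \\
2 & 2 & -1 & 3 \\
-1 & -1 & 2 & -3
\end{bmatrix}
= \begin{bmatrix}
1 & -1 \\
2 & -1 \\
-1 & 2
\end{bmatrix}
\begin{bmatrix}
1 & 1 & 0 & 1 \\
0 & 0 & 1 & -1
\end{bmatrix}
= BC
\]

Then

\[
(CC^H)^{-1} = \frac{1}{5} \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}, \quad
C^H (CC^H)^{-1} = \frac{1}{5} \begin{bmatrix} 2 & 1 \\ 2 & 1 \\ 1 & 3 \\ 1 & -2 \end{bmatrix}
\]

\[
(B^H B)^{-1} = \frac{1}{11} \begin{bmatrix} 6 & 5 \\ 5 & 6 \end{bmatrix}, \quad
(B^H B)^{-1} B^H = \frac{1}{11} \begin{bmatrix} 1 & 7 & 4 \\ -1 & 4 & 7 \end{bmatrix}
\]

Accordingly, the following is the Moore-Penrose inverse of $A$:

\[
A^+ = \frac{1}{55} \begin{bmatrix}
1 & 18 & 15 \\
1 & 18 & 15 \\
-2 & 19 & 25 \\
3 & -1 & -10
\end{bmatrix}
\]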

4.15 Least-Square Solution 

Consider a system $AX = B$ of linear equations. A least-square solution of $AX = B$ is the vector of smallest
Euclidean norm that minimizes $\|AX - B\|_2$. That vector is

\[ X = A^+ B \]

[In case $A$ is invertible, so $A^+ = A^{-1}$, then $X = A^{-1}B$, which is the unique solution of the system.]

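EXAMPLE 4.21 Consider the following system $AX = B$ of linear equations:

\[
\begin{aligned}
x + y - z + 2t &= 1 \\
2x + 2y - z + 3t &= 3 \\
-x - y + 2z - 3t &= 2
\end{aligned}
\]

Then, using Example 4.20,

\[
A = \begin{bmatrix}
1 & 1 & -1 & 2 \\
2 & 2 & -1 & 3 \\
-1 & -1 & 2 & -3
\end{bmatrix}, \quad
B = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}, \quad
A^+ = \frac{1}{55} \begin{bmatrix}
1 & 18 & 15 \\
1 & 18 & 15 \\
-2 & 19 & 25 \\
3 & -1 & -10
\end{bmatrix}
\]

Accordingly,

\[ X = A^+ B = (1/55)[85, 85, 105, -20]^T = [17/11, 17/11, 21/11, -4/11]^T \]

is the vector of smallest Euclidean norm which minimizes $\|AX - B\|_2$.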
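
The computations of Examples 4.20 and 4.21 can be reproduced with a short Python sketch based on the formula of Theorem 4.30. The sketch assumes real matrices, so that the conjugate transpose reduces to the ordinary transpose; the helper names `T`, `mul`, `inv`, and `pinv_from_factorization` are illustrative choices, and `inv` assumes its argument is nonsingular.

```python
from fractions import Fraction

def T(m):
    """Transpose (equal to the conjugate transpose for real matrices)."""
    return [list(col) for col in zip(*m)]

def mul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inv(m):
    """Inverse of a nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for c in range(n):
        p = next(i for i in range(c, n) if a[i][c] != 0)
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [x / piv for x in a[c]]
        for i in range(n):
            if i != c and a[i][c] != 0:
                f = a[i][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[c])]
    return [row[n:] for row in a]

def pinv_from_factorization(B, C):
    """A+ = C^T (C C^T)^(-1) (B^T B)^(-1) B^T for a real full rank factorization A = BC."""
    Ct, Bt = T(C), T(B)
    return mul(mul(Ct, inv(mul(C, Ct))), mul(inv(mul(Bt, B)), Bt))

B = [[1, -1], [2, -1], [-1, 2]]
C = [[1, 1, 0, 1], [0, 0, 1, -1]]
A = mul(B, C)
A_plus = pinv_from_factorization(B, C)

X = mul(A_plus, [[1], [3], [2]])             # least-square solution of Example 4.21
print([str(row[0]) for row in X])
print(mul(mul(A, A_plus), A) == A)           # equation [MP1]
```

The same `A_plus` can be checked against the remaining equations [MP2] to [MP4] in the same way.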


SOLVED PROBLEMS 


Vector Spaces, Linear Combinations 

4.1. Suppose $u$ and $v$ belong to a vector space $V$. Simplify each of the following expressions:

(a) $E_1 = 3(2u - 4v) + 5u + 7v$, (c) $E_3 = 2uv + 3(2u + 4v)$

(b) $E_2 = 3u - 6(3u - 5v) + 7u$, (d) $E_4 = 5u - \dfrac{3}{v} + 5u$

Multiply out and collect terms:

(a) $E_1 = 6u - 12v + 5u + 7v = 11u - 5v$







(b) $E_2 = 3u - 18u + 30v + 7u = -8u + 30v$

(c) E 3 is not defined because the product uv of vectors is not defined. 

(d) E 4 is not defined because division by a vector is not defined. 

4.2. Prove Theorem 4.1: Let $V$ be a vector space over a field $K$.

(i) $k0 = 0$. (ii) $0u = 0$. (iii) If $ku = 0$, then $k = 0$ or $u = 0$. (iv) $(-k)u = k(-u) = -ku$.

(i) By Axiom [A$_2$] with $u = 0$, we have $0 + 0 = 0$. Hence, by Axiom [M$_1$], we have

\[ k0 = k(0 + 0) = k0 + k0 \]

Adding $-k0$ to both sides gives the desired result.

(ii) For scalars, $0 + 0 = 0$. Hence, by Axiom [M$_2$], we have

\[ 0u = (0 + 0)u = 0u + 0u \]

Adding $-0u$ to both sides gives the desired result.

(iii) Suppose $ku = 0$ and $k \ne 0$. Then there exists a scalar $k^{-1}$ such that $k^{-1}k = 1$. Thus,

\[ u = 1u = (k^{-1}k)u = k^{-1}(ku) = k^{-1}0 = 0 \]

(iv) Using $u + (-u) = 0$ and $k + (-k) = 0$ yields

\[ 0 = k0 = k[u + (-u)] = ku + k(-u) \quad \text{and} \quad 0 = 0u = [k + (-k)]u = ku + (-k)u \]

Adding $-ku$ to both sides of the first equation gives $-ku = k(-u)$, and adding $-ku$ to both sides of the
second equation gives $-ku = (-k)u$. Thus, $(-k)u = k(-u) = -ku$.

4.3. Show that (a) $k(u - v) = ku - kv$, (b) $u + u = 2u$.

(a) Using the definition of subtraction, that $u - v = u + (-v)$, and Theorem 4.1(iv), that $k(-v) = -kv$, we
have

\[ k(u - v) = k[u + (-v)] = ku + k(-v) = ku + (-kv) = ku - kv \]

(b) Using Axiom [M$_4$] and then Axiom [M$_2$], we have

\[ u + u = 1u + 1u = (1 + 1)u = 2u \]

4.4. Express $v = (1, -2, 5)$ in $\mathbf{R}^3$ as a linear combination of the vectors

\[ u_1 = (1, 1, 1), \quad u_2 = (1, 2, 3), \quad u_3 = (2, -1, 1) \]

We seek scalars $x$, $y$, $z$, as yet unknown, such that $v = x u_1 + y u_2 + z u_3$. Thus, we require

\[
\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix}
= x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
+ y \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
+ z \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}
\quad \text{or} \quad
\begin{aligned}
x + y + 2z &= 1 \\
x + 2y - z &= -2 \\
x + 3y + z &= 5
\end{aligned}
\]

(For notational convenience, we write the vectors in $\mathbf{R}^3$ as columns, because it is then easier to find the
equivalent system of linear equations.) Reducing the system to echelon form yields the triangular system

\[ x + y + 2z = 1, \quad y - 3z = -3, \quad 5z = 10 \]

The system is consistent and has a solution. Solving by back-substitution yields the solution $x = -6$, $y = 3$,
$z = 2$. Thus, $v = -6u_1 + 3u_2 + 2u_3$.

Alternatively, write down the augmented matrix $M$ of the equivalent system of linear equations, where
$u_1$, $u_2$, $u_3$ are the first three columns of $M$ and $v$ is the last column, and then reduce $M$ to echelon form:

\[
M = \begin{bmatrix}
1 & 1 & 2 & 1 \\
1 & 2 & -1 & -2 \\
1 & 3 & 1 & 5
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 1 & 2 & 1 \\
0 & 1 & -3 & -3 \\
0 & 2 & -1 & 4
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 1 & 2 & 1 \\
0 & 1 & -3 & -3 \\
0 & 0 & 5 & 10
\end{bmatrix}
\]

The last matrix corresponds to a triangular system, which has a solution. Solving the triangular system by
back-substitution yields the solution $x = -6$, $y = 3$, $z = 2$. Thus, $v = -6u_1 + 3u_2 + 2u_3$.

4.5. Express $v = (2, -5, 3)$ in $\mathbf{R}^3$ as a linear combination of the vectors

\[ u_1 = (1, -3, 2), \quad u_2 = (2, -4, -1), \quad u_3 = (1, -5, 7) \]

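We seek scalars $x$, $y$, $z$, as yet unknown, such that $v = x u_1 + y u_2 + z u_3$. Thus, we require

\[
\begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix}
= x \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix}
+ y \begin{bmatrix} 2 \\ -4 \\ -1 \end{bmatrix}
+ z \begin{bmatrix} 1 \\ -5 \\ 7 \end{bmatrix}
\quad \text{or} \quad
\begin{aligned}
x + 2y + z &= 2 \\
-3x - 4y - 5z &= -5 \\
2x - y + 7z &= 3
\end{aligned}
\]

Reducing the system to echelon form yields the system

\[ x + 2y + z = 2, \quad 2y - 2z = 1, \quad 0 = 3 \]

The system is inconsistent and so has no solution. Thus, $v$ cannot be written as a linear combination of
$u_1$, $u_2$, $u_3$.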
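
Problems 4.4 and 4.5 follow the same recipe: reduce the augmented matrix and either read off the coefficients or detect a row of the form $0 = c$ with $c \ne 0$. A minimal Python sketch of that recipe is given below; the function name `linear_combination` is an assumption, and when the given vectors are dependent the sketch returns just one of the possible coefficient lists (free unknowns set to 0).

```python
from fractions import Fraction

def linear_combination(vectors, v):
    """Coefficients expressing v through `vectors`, or None if no such expression exists."""
    n = len(vectors)
    # augmented matrix: the given vectors as columns, v as the last column
    m = [[Fraction(u[i]) for u in vectors] + [Fraction(v[i])] for i in range(len(v))]
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    # a row reading 0 = nonzero means the system is inconsistent
    if any(all(x == 0 for x in row[:n]) and row[n] != 0 for row in m):
        return None
    coeffs = [Fraction(0)] * n
    for row, p in zip(m, pivots):
        coeffs[p] = row[n]
    return coeffs

# Problem 4.4: a solution exists
print(linear_combination([(1, 1, 1), (1, 2, 3), (2, -1, 1)], (1, -2, 5)))
# Problem 4.5: the system is inconsistent
print(linear_combination([(1, -3, 2), (2, -4, -1), (1, -5, 7)], (2, -5, 3)))
```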

4.6. Express the polynomial $v = t^2 + 4t - 3$ in $\mathbf{P}(t)$ as a linear combination of the polynomials

$$p_1 = t^2 - 2t + 5, \qquad p_2 = 2t^2 - 3t, \qquad p_3 = t + 3$$

Set $v$ as a linear combination of $p_1$, $p_2$, $p_3$ using unknowns $x$, $y$, $z$ to obtain

$$t^2 + 4t - 3 = x(t^2 - 2t + 5) + y(2t^2 - 3t) + z(t + 3) \qquad (*)$$

We can proceed in two ways.

Method 1. Expand the right side of (*) and express it in terms of powers of $t$ as follows:

$$\begin{aligned} t^2 + 4t - 3 &= xt^2 - 2xt + 5x + 2yt^2 - 3yt + zt + 3z \\ &= (x + 2y)t^2 + (-2x - 3y + z)t + (5x + 3z) \end{aligned}$$

Set coefficients of the same powers of $t$ equal to each other, and reduce the system to echelon form. This
yields

$$\begin{array}{rcr} x + 2y &=& 1 \\ -2x - 3y + z &=& 4 \\ 5x + 3z &=& -3 \end{array} \qquad\text{or}\qquad \begin{array}{rcr} x + 2y &=& 1 \\ y + z &=& 6 \\ -10y + 3z &=& -8 \end{array} \qquad\text{or}\qquad \begin{array}{rcr} x + 2y &=& 1 \\ y + z &=& 6 \\ 13z &=& 52 \end{array}$$

The system is consistent and has a solution. Solving by back-substitution yields the solution $x = -3$, $y = 2$,
$z = 4$. Thus, $v = -3p_1 + 2p_2 + 4p_3$.

Method 2. The equation (*) is an identity in $t$; that is, the equation holds for any value of $t$. Thus, we can set
$t$ equal to any numbers to obtain equations in the unknowns.

(a) Set $t = 0$ in (*) to obtain the equation $-3 = 5x + 3z$.

(b) Set $t = 1$ in (*) to obtain the equation $2 = 4x - y + 4z$.

(c) Set $t = -1$ in (*) to obtain the equation $-6 = 8x + 5y + 2z$.

Solve the system of the three equations to again obtain the solution $x = -3$, $y = 2$, $z = 4$. Thus,
$v = -3p_1 + 2p_2 + 4p_3$.
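The back-substitution in Method 1 can be checked mechanically. The following is a minimal sketch, assuming the system $x + 2y = 1$, $-2x - 3y + z = 4$, $5x + 3z = -3$ from Method 1; the helper name `solve_exact` is hypothetical and is not part of the text. It uses exact rational arithmetic so that no rounding enters the check.

```python
from fractions import Fraction

def solve_exact(aug):
    """Solve a square system given as an augmented matrix, using exact fractions.

    Hypothetical helper for illustration; assumes the solution is unique.
    """
    rows = [[Fraction(x) for x in row] for row in aug]
    n = len(rows)
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # scale the pivot row so that the pivot entry becomes 1
        p = rows[col][col]
        rows[col] = [x / p for x in rows[col]]
        # clear this column in every other row
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

system = [
    [1, 2, 0, 1],     # t^2 terms:       x + 2y      =  1
    [-2, -3, 1, 4],   # t terms:        -2x - 3y + z =  4
    [5, 0, 3, -3],    # constant terms:  5x     + 3z = -3
]
x, y, z = solve_exact(system)
print(x, y, z)  # -3 2 4
```

The same helper applies to any of the small square systems in this chapter, provided the coefficient matrix is invertible.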


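4.7. Express $M$ as a linear combination of the matrices $A$, $B$, $C$, where

$$M = \begin{bmatrix} 4 & 7 \\ 7 & 9 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 1 \\ 4 & 5 \end{bmatrix}$$

Set $M$ as a linear combination of $A$, $B$, $C$ using unknown scalars $x$, $y$, $z$; that is, set $M = xA + yB + zC$.
This yields

$$\begin{bmatrix} 4 & 7 \\ 7 & 9 \end{bmatrix} = x\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + y\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + z\begin{bmatrix} 1 & 1 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} x + y + z & x + 2y + z \\ x + 3y + 4z & x + 4y + 5z \end{bmatrix}$$

Form the equivalent system of equations by setting corresponding entries equal to each other:

$$x + y + z = 4, \qquad x + 2y + z = 7, \qquad x + 3y + 4z = 7, \qquad x + 4y + 5z = 9$$

Reducing the system to echelon form yields

$$x + y + z = 4, \qquad y = 3, \qquad 3z = -3, \qquad 4z = -4$$

The last equation drops out. Solving the system by back-substitution yields $z = -1$, $y = 3$, $x = 2$. Thus,
$M = 2A + 3B - C$.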

Subspaces 

4.8. Prove Theorem 4.2: $W$ is a subspace of $V$ if the following two conditions hold:

(a) $0 \in W$. (b) If $u, v \in W$, then $u + v,\ ku \in W$.

By (a), $W$ is nonempty, and, by (b), the operations of vector addition and scalar multiplication are well
defined for $W$. Axioms $[A_1]$, $[A_4]$, $[M_1]$, $[M_2]$, $[M_3]$, $[M_4]$ hold in $W$ because the vectors in $W$ belong to $V$.
Thus, we need only show that $[A_2]$ and $[A_3]$ also hold in $W$. Now $[A_2]$ holds because the zero vector in $V$
belongs to $W$ by (a). Finally, if $v \in W$, then $(-1)v = -v \in W$, and $v + (-v) = 0$. Thus $[A_3]$ holds.

4.9. Let $V = \mathbf{R}^3$. Show that $W$ is not a subspace of $V$, where

(a) $W = \{(a, b, c) : a \ge 0\}$, (b) $W = \{(a, b, c) : a^2 + b^2 + c^2 \le 1\}$.

In each case, show that Theorem 4.2 does not hold.

(a) $W$ consists of those vectors whose first entry is nonnegative. Thus, $v = (1, 2, 3)$ belongs to $W$. Let
$k = -3$. Then $kv = (-3, -6, -9)$ does not belong to $W$, because $-3$ is negative. Thus, $W$ is not a
subspace of $V$.

(b) $W$ consists of vectors whose length does not exceed 1. Hence, $u = (1, 0, 0)$ and $v = (0, 1, 0)$ belong to $W$,
but $u + v = (1, 1, 0)$ does not belong to $W$, because $1^2 + 1^2 + 0^2 = 2 > 1$. Thus, $W$ is not a subspace of $V$.

4.10. Let $V = \mathbf{P}(t)$, the vector space of real polynomials. Determine whether or not $W$ is a subspace of $V$,
where

(a) $W$ consists of all polynomials with integral coefficients.

(b) $W$ consists of all polynomials with degree $\le 6$ and the zero polynomial.

(c) $W$ consists of all polynomials with only even powers of $t$.

(a) No, because scalar multiples of polynomials in $W$ do not always belong to $W$. For example,

$$f(t) = 3 + 6t + 7t^2 \in W \quad\text{but}\quad \tfrac{1}{2}f(t) = \tfrac{3}{2} + 3t + \tfrac{7}{2}t^2 \notin W$$

(b and c) Yes. In each case, $W$ contains the zero polynomial, and sums and scalar multiples of polynomials
in $W$ belong to $W$.

4.11. Let $V$ be the vector space of functions $f : \mathbf{R} \to \mathbf{R}$. Show that $W$ is a subspace of $V$, where

(a) $W = \{f(x) : f(1) = 0\}$, all functions whose value at 1 is 0.

(b) $W = \{f(x) : f(3) = f(1)\}$, all functions assigning the same value to 3 and 1.

(c) $W = \{f(x) : f(-x) = -f(x)\}$, all odd functions.

Let $\hat{0}$ denote the zero function, so $\hat{0}(x) = 0$ for every value of $x$.

(a) $\hat{0} \in W$, because $\hat{0}(1) = 0$. Suppose $f, g \in W$. Then $f(1) = 0$ and $g(1) = 0$. Also, for scalars $a$ and $b$, we have

$$(af + bg)(1) = af(1) + bg(1) = a0 + b0 = 0$$

Thus, $af + bg \in W$, and hence $W$ is a subspace.

(b) $\hat{0} \in W$, because $\hat{0}(3) = 0 = \hat{0}(1)$. Suppose $f, g \in W$. Then $f(3) = f(1)$ and $g(3) = g(1)$. Thus, for any
scalars $a$ and $b$, we have

$$(af + bg)(3) = af(3) + bg(3) = af(1) + bg(1) = (af + bg)(1)$$

Thus, $af + bg \in W$, and hence $W$ is a subspace.

(c) $\hat{0} \in W$, because $\hat{0}(-x) = 0 = -0 = -\hat{0}(x)$. Suppose $f, g \in W$. Then $f(-x) = -f(x)$ and $g(-x) = -g(x)$.
Also, for scalars $a$ and $b$,

$$(af + bg)(-x) = af(-x) + bg(-x) = -af(x) - bg(x) = -(af + bg)(x)$$

Thus, $af + bg \in W$, and hence $W$ is a subspace of $V$.

4.12. Prove Theorem 4.3: The intersection of any number of subspaces of $V$ is a subspace of $V$.

Let $\{W_i : i \in I\}$ be a collection of subspaces of $V$ and let $W = \bigcap (W_i : i \in I)$. Because each $W_i$ is a subspace
of $V$, we have $0 \in W_i$ for every $i \in I$. Hence, $0 \in W$. Suppose $u, v \in W$. Then $u, v \in W_i$ for every $i \in I$. Because
each $W_i$ is a subspace, $au + bv \in W_i$ for every $i \in I$. Hence, $au + bv \in W$. Thus, $W$ is a subspace of $V$.

Linear Spans 

4.13. Show that the vectors $u_1 = (1, 1, 1)$, $u_2 = (1, 2, 3)$, $u_3 = (1, 5, 8)$ span $\mathbf{R}^3$.

We need to show that an arbitrary vector $v = (a, b, c)$ in $\mathbf{R}^3$ is a linear combination of $u_1$, $u_2$, $u_3$. Set
$v = xu_1 + yu_2 + zu_3$; that is, set

$$(a, b, c) = x(1, 1, 1) + y(1, 2, 3) + z(1, 5, 8) = (x + y + z,\ x + 2y + 5z,\ x + 3y + 8z)$$

Form the equivalent system and reduce it to echelon form:

$$\begin{array}{rcl} x + y + z &=& a \\ x + 2y + 5z &=& b \\ x + 3y + 8z &=& c \end{array} \qquad\text{or}\qquad \begin{array}{rcl} x + y + z &=& a \\ y + 4z &=& b - a \\ 2y + 7z &=& c - a \end{array} \qquad\text{or}\qquad \begin{array}{rcl} x + y + z &=& a \\ y + 4z &=& b - a \\ -z &=& c - 2b + a \end{array}$$

The above system is in echelon form and is consistent; in fact,

$$x = -a + 5b - 3c, \qquad y = 3a - 7b + 4c, \qquad z = -a + 2b - c$$

is a solution. Thus, $u_1$, $u_2$, $u_3$ span $\mathbf{R}^3$.

4.14. Find conditions on $a$, $b$, $c$ so that $v = (a, b, c)$ in $\mathbf{R}^3$ belongs to $W = \operatorname{span}(u_1, u_2, u_3)$, where

$$u_1 = (1, 2, 0), \qquad u_2 = (-1, 1, 2), \qquad u_3 = (3, 0, -4)$$

Set $v$ as a linear combination of $u_1$, $u_2$, $u_3$ using unknowns $x$, $y$, $z$; that is, set $v = xu_1 + yu_2 + zu_3$. This
yields

$$(a, b, c) = x(1, 2, 0) + y(-1, 1, 2) + z(3, 0, -4) = (x - y + 3z,\ 2x + y,\ 2y - 4z)$$

Form the equivalent system of linear equations and reduce it to echelon form:

$$\begin{array}{rcl} x - y + 3z &=& a \\ 2x + y &=& b \\ 2y - 4z &=& c \end{array} \qquad\text{or}\qquad \begin{array}{rcl} x - y + 3z &=& a \\ 3y - 6z &=& b - 2a \\ 2y - 4z &=& c \end{array} \qquad\text{or}\qquad \begin{array}{rcl} x - y + 3z &=& a \\ 3y - 6z &=& b - 2a \\ 0 &=& 4a - 2b + 3c \end{array}$$

The vector $v = (a, b, c)$ belongs to $W$ if and only if the system is consistent, and it is consistent if and only if
$4a - 2b + 3c = 0$. Note, in particular, that $u_1$, $u_2$, $u_3$ do not span the whole space $\mathbf{R}^3$.
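The consistency condition can be cross-checked numerically. The sketch below is an illustration only: it assumes the three spanning vectors of Problem 4.14, and the helper names `rank` and `in_span` are hypothetical. It tests membership in the span by comparing the rank of the coefficient matrix with the rank of the augmented matrix, then compares the answer with the condition $4a - 2b + 3c = 0$.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix, found by forward elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][col] / rows[rk][col]
            rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def in_span(vectors, v):
    """True when v is a linear combination of the given vectors."""
    # the spanning vectors become the columns of the coefficient matrix
    coeff = [list(col) for col in zip(*vectors)]
    augmented = [row + [vi] for row, vi in zip(coeff, v)]
    return rank(coeff) == rank(augmented)

u1, u2, u3 = (1, 2, 0), (-1, 1, 2), (3, 0, -4)
for a, b, c in [(1, 2, 0), (2, 4, 0), (1, 0, 0), (3, 0, -4)]:
    print((a, b, c), in_span([u1, u2, u3], (a, b, c)), 4 * a - 2 * b + 3 * c == 0)
```

For each sample vector the two printed truth values agree, which is what the derivation above predicts.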

4.15. Show that the vector space $V = \mathbf{P}(t)$ of real polynomials cannot be spanned by a finite number of
polynomials.

Any finite set $S$ of polynomials contains a polynomial of maximum degree, say $m$. Then the linear span
$\operatorname{span}(S)$ of $S$ cannot contain a polynomial of degree greater than $m$. Thus, $\operatorname{span}(S) \ne V$, for any finite set $S$.

4.16. Prove Theorem 4.5: Let $S$ be a subset of $V$. (i) Then $\operatorname{span}(S)$ is a subspace of $V$ containing $S$.
(ii) If $W$ is a subspace of $V$ containing $S$, then $\operatorname{span}(S) \subseteq W$.

(i) Suppose $S$ is empty. By definition, $\operatorname{span}(S) = \{0\}$. Hence $\operatorname{span}(S) = \{0\}$ is a subspace of $V$ and
$S \subseteq \operatorname{span}(S)$. Suppose $S$ is not empty and $v \in S$. Then $v = 1v \in \operatorname{span}(S)$; hence, $S \subseteq \operatorname{span}(S)$. Also
$0 = 0v \in \operatorname{span}(S)$. Now suppose $u, w \in \operatorname{span}(S)$, say

$$u = a_1u_1 + \cdots + a_ru_r = \sum_i a_iu_i \qquad\text{and}\qquad w = b_1w_1 + \cdots + b_sw_s = \sum_j b_jw_j$$

where $u_i, w_j \in S$ and $a_i, b_j \in K$. Then

$$u + w = \sum_i a_iu_i + \sum_j b_jw_j \qquad\text{and}\qquad ku = k\Big(\sum_i a_iu_i\Big) = \sum_i ka_iu_i$$

belong to $\operatorname{span}(S)$ because each is a linear combination of vectors in $S$. Thus, $\operatorname{span}(S)$ is a subspace of $V$.

(ii) Suppose $u_1, u_2, \ldots, u_r \in S$. Then all the $u_i$ belong to $W$. Thus, all multiples $a_1u_1, a_2u_2, \ldots, a_ru_r \in W$,
and so the sum $a_1u_1 + a_2u_2 + \cdots + a_ru_r \in W$. That is, $W$ contains all linear combinations of elements in
$S$, or, in other words, $\operatorname{span}(S) \subseteq W$, as claimed.


Linear Dependence 

4.17. Determine whether or not $u$ and $v$ are linearly dependent, where

(a) $u = (1, 2)$, $v = (3, -5)$, (c) $u = (1, 2, -3)$, $v = (4, 5, -6)$

(b) $u = (1, -3)$, $v = (-2, 6)$, (d) $u = (2, 4, -8)$, $v = (3, 6, -12)$

Two vectors $u$ and $v$ are linearly dependent if and only if one is a multiple of the other.
(a) No. (b) Yes; for $v = -2u$. (c) No. (d) Yes, for $v = \tfrac{3}{2}u$.


4.18. Determine whether or not $u$ and $v$ are linearly dependent, where

(a) $u = 2t^2 + 4t - 3$, $v = 4t^2 + 8t - 6$, (b) $u = 2t^2 - 3t + 4$, $v = 4t^2 - 3t + 2$,

(c) $u = \begin{bmatrix} 1 & 3 & -4 \\ 5 & 0 & -1 \end{bmatrix}$, $v = \begin{bmatrix} -4 & -12 & 16 \\ -20 & 0 & 4 \end{bmatrix}$, (d) $u = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \end{bmatrix}$, $v = \begin{bmatrix} 2 & 2 & 2 \\ 3 & 3 & 3 \end{bmatrix}$

Two vectors $u$ and $v$ are linearly dependent if and only if one is a multiple of the other.
(a) Yes; for $v = 2u$. (b) No. (c) Yes, for $v = -4u$. (d) No.


4.19. Determine whether or not the vectors $u = (1, 1, 2)$, $v = (2, 3, 1)$, $w = (4, 5, 5)$ in $\mathbf{R}^3$ are linearly
dependent.

Method 1. Set a linear combination of $u$, $v$, $w$ equal to the zero vector using unknowns $x$, $y$, $z$ to obtain
the equivalent homogeneous system of linear equations and then reduce the system to echelon form.
This yields

$$x\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix} + z\begin{bmatrix} 4 \\ 5 \\ 5 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \qquad\text{or}\qquad \begin{array}{rcl} x + 2y + 4z &=& 0 \\ x + 3y + 5z &=& 0 \\ 2x + y + 5z &=& 0 \end{array} \qquad\text{or}\qquad \begin{array}{rcl} x + 2y + 4z &=& 0 \\ y + z &=& 0 \end{array}$$

The echelon system has only two nonzero equations in three unknowns; hence, it has a free variable and a
nonzero solution. Thus, $u$, $v$, $w$ are linearly dependent.

Method 2. Form the matrix $A$ whose columns are $u$, $v$, $w$ and reduce to echelon form:

$$A = \begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 5 \\ 2 & 1 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 1 \\ 0 & -3 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

The third column does not have a pivot; hence, the third vector $w$ is a linear combination of the first two
vectors $u$ and $v$. Thus, the vectors are linearly dependent. (Observe that the matrix $A$ is also the coefficient
matrix in Method 1. In other words, this method is essentially the same as the first method.)

Method 3. Form the matrix $B$ whose rows are $u$, $v$, $w$, and reduce to echelon form:

$$B = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 1 \\ 4 & 5 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -3 \\ 0 & 1 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -3 \\ 0 & 0 & 0 \end{bmatrix}$$

Because the echelon matrix has only two nonzero rows, the three vectors are linearly dependent. (The three
given vectors span a space of dimension 2.)
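An explicit dependence relation makes the conclusion concrete. The sketch below is an illustration, not part of the original solution: it takes the echelon system $x + 2y + 4z = 0$, $y + z = 0$ from Method 1, picks the free variable $z = 1$ (an arbitrary choice), back-substitutes, and confirms that the resulting combination of $u$, $v$, $w$ is the zero vector.

```python
u, v, w = (1, 1, 2), (2, 3, 1), (4, 5, 5)

# Back-substitute in the echelon system with the free variable set to 1.
z = 1
y = -z                  # from y + z = 0
x = -2 * y - 4 * z      # from x + 2y + 4z = 0

# Form x*u + y*v + z*w component by component.
combo = tuple(x * a + y * b + z * c for a, b, c in zip(u, v, w))
print((x, y, z), combo)  # (-2, -1, 1) (0, 0, 0)
```

In other words, $w = 2u + v$, which is the linear combination promised by the missing pivot in the third column in Method 2.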

4.20. Determine whether or not each of the following lists of vectors in R 3 is linearly dependent: 

(a) $u_1 = (1, 2, 5)$, $u_2 = (1, 3, 1)$, $u_3 = (2, 5, 7)$, $u_4 = (3, 1, 4)$,

(b) $u = (1, 2, 5)$, $v = (2, 5, 1)$, $w = (1, 5, 2)$,

(c) $u = (1, 2, 3)$, $v = (0, 0, 0)$, $w = (1, 5, 6)$.

(a) Yes, because any four vectors in $\mathbf{R}^3$ are linearly dependent.

(b) Use Method 2 above; that is, form the matrix $A$ whose columns are the given vectors, and reduce the
matrix to echelon form:

$$A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 5 & 5 \\ 5 & 1 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & -9 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 24 \end{bmatrix}$$

Every column has a pivot entry; hence, no vector is a linear combination of the previous vectors. Thus, 
the vectors are linearly independent. 

(c) Because 0 = (0,0,0) is one of the vectors, the vectors are linearly dependent. 

4.21. Show that the functions $f(t) = \sin t$, $g(t) = \cos t$, $h(t) = t$ from $\mathbf{R}$ into $\mathbf{R}$ are linearly independent.

Set a linear combination of the functions equal to the zero function $\hat{0}$ using unknown scalars $x$, $y$, $z$; that
is, set $xf + yg + zh = \hat{0}$. Then show $x = 0$, $y = 0$, $z = 0$. We emphasize that $xf + yg + zh = \hat{0}$ means that, for
every value of $t$, we have $xf(t) + yg(t) + zh(t) = 0$.

Thus, in the equation $x \sin t + y \cos t + zt = 0$:

(i) Set $t = 0$ to obtain $x(0) + y(1) + z(0) = 0$ or $y = 0$.

(ii) Set $t = \pi/2$ to obtain $x(1) + y(0) + z\pi/2 = 0$ or $x + \pi z/2 = 0$.

(iii) Set $t = \pi$ to obtain $x(0) + y(-1) + z(\pi) = 0$ or $-y + \pi z = 0$.

The three equations have only the zero solution; that is, $x = 0$, $y = 0$, $z = 0$. Thus, $f$, $g$, $h$ are linearly
independent.

4.22. Suppose the vectors u, v, w are linearly independent. Show that the vectors u + v, u — v, 
u — 2v + w are also linearly independent. 

Suppose x(u + v) + y(u — v) + z(u — 2v + w) = 0. Then 

xu + xv + yu — yv + zu — 2 zv + zw = 0 

or 

(x + y + z)u + (x - y - 2z)v + zw = 0 

Because u, v, w are linearly independent, the coefficients in the above equation are each 0; hence, 

x + y + z = 0, x - y - 2z = 0, z= 0 

The only solution to the above homogeneous system is x = 0, y = 0, z = 0. Thus, u + v, u — v, u — 2v + w 
are linearly independent. 

4.23. Show that the vectors $u = (1 + i, 2i)$ and $w = (1, 1 + i)$ in $\mathbf{C}^2$ are linearly dependent over the
complex field $\mathbf{C}$ but linearly independent over the real field $\mathbf{R}$.

Recall that two vectors are linearly dependent (over a field $K$) if and only if one of them is a multiple of
the other (by an element in $K$). Because

$$(1 + i)w = (1 + i)(1, 1 + i) = (1 + i, 2i) = u$$

$u$ and $w$ are linearly dependent over $\mathbf{C}$. On the other hand, $u$ and $w$ are linearly independent over $\mathbf{R}$, as no real
multiple of $w$ can equal $u$. Specifically, when $k$ is real, the first component of $kw = (k, k + ki)$ must be real,
and it can never equal the first component $1 + i$ of $u$, which is complex.
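The argument can be mirrored with Python's built-in complex numbers. This is a small illustrative sketch under the problem's data: since the first component of $w$ is 1, the only scalar $k$ with $kw = u$ is the first component of $u$, so it suffices to test that single candidate and then look at whether it is real.

```python
u = (1 + 1j, 2j)
w = (1, 1 + 1j)

# The first components force k = u[0] / w[0]; no other scalar can work.
k = u[0] / w[0]

# Dependent over C: k * w reproduces u exactly.
print(k, tuple(k * c for c in w) == u)

# Independent over R: the only candidate scalar has a nonzero imaginary part.
print(k.imag == 0)
```

The first line reports the scalar $1 + i$ together with `True`; the second reports `False`, so no real multiple of $w$ equals $u$.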

Basis and Dimension 

4.24. Determine whether or not each of the following form a basis of R 3 : 

(a) (1,1,1), (1,0,1); (c) (1,1,1), (1,2,3), (2,-1,1); 

(b) (1,2,3), (1,3,5), (1,0,1), (2,3,0); (d) (1,1,2), (1,2,5), (5,3,4). 

(a and b) No. Because $\dim \mathbf{R}^3 = 3$, a basis of $\mathbf{R}^3$ must contain exactly three elements.

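(c) The three vectors form a basis if and only if they are linearly independent. Thus, form the matrix whose
rows are the given vectors, and row reduce the matrix to echelon form:

$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & -1 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & -3 & -1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 5 \end{bmatrix}$$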

The echelon matrix has no zero rows; hence, the three vectors are linearly independent, and so they do 
form a basis of R 3 . 

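(d) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form:

$$\begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 5 \\ 5 & 3 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & -2 & -6 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix}$$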

The echelon matrix has a zero row; hence, the three vectors are linearly dependent, and so they do not 
form a basis of R 3 . 

4.25. Determine whether $(1, 1, 1, 1)$, $(1, 2, 3, 2)$, $(2, 5, 6, 4)$, $(2, 6, 8, 5)$ form a basis of $\mathbf{R}^4$. If not, find
the dimension of the subspace they span.

Form the matrix whose rows are the given vectors, and row reduce to echelon form:

$$B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 2 \\ 2 & 5 & 6 & 4 \\ 2 & 6 & 8 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 3 & 4 & 2 \\ 0 & 4 & 6 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & -2 & -1 \\ 0 & 0 & -2 & -1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

The echelon matrix has a zero row. Hence, the four vectors are linearly dependent and do not form a basis of
$\mathbf{R}^4$. Because the echelon matrix has three nonzero rows, the four vectors span a subspace of dimension 3.

4.26. Extend $\{u_1 = (1, 1, 1, 1),\ u_2 = (2, 2, 3, 4)\}$ to a basis of $\mathbf{R}^4$.

First form the matrix with rows $u_1$ and $u_2$, and reduce to echelon form:

$$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & 3 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 2 \end{bmatrix}$$

Then $w_1 = (1, 1, 1, 1)$ and $w_2 = (0, 0, 1, 2)$ span the same set of vectors as spanned by $u_1$ and $u_2$. Let
$u_3 = (0, 1, 0, 0)$ and $u_4 = (0, 0, 0, 1)$. Then $w_1$, $u_3$, $w_2$, $u_4$ form a matrix in echelon form. Thus, they are
linearly independent, and they form a basis of $\mathbf{R}^4$. Hence, $u_1$, $u_2$, $u_3$, $u_4$ also form a basis of $\mathbf{R}^4$.
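The extension step in Problem 4.26 follows a pattern that can be sketched in code: row reduce the given vectors, note which columns have no pivot, and adjoin the standard unit vector for each such column. The sketch below is an illustration under the assumption that the input vectors are linearly independent; the function names `pivot_columns` and `extend_to_basis` are hypothetical.

```python
from fractions import Fraction

def pivot_columns(rows):
    """Row reduce a copy of `rows` and return the indices of the pivot columns."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for col in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue  # no pivot available in this column
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
        if r == len(m):
            break  # every row already carries a pivot
    return pivots

def extend_to_basis(vectors):
    """Adjoin the unit vector e_c for every column c that has no pivot.

    Assumes the given vectors are linearly independent.
    """
    n = len(vectors[0])
    pivots = pivot_columns(vectors)
    extra = [tuple(1 if j == c else 0 for j in range(n))
             for c in range(n) if c not in pivots]
    return list(vectors) + extra

basis = extend_to_basis([(1, 1, 1, 1), (2, 2, 3, 4)])
print(basis)
print(len(pivot_columns(basis)))  # 4 pivots, so the four vectors are independent
```

For the vectors of Problem 4.26 the adjoined vectors are $(0, 1, 0, 0)$ and $(0, 0, 0, 1)$, matching the choice of $u_3$ and $u_4$ above. Other extensions are possible; this choice is simply the one that keeps the combined matrix in echelon form after reordering.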

4.27. Consider the complex field C, which contains the real field R, which contains the rational field Q. 
(Thus, C is a vector space over R, and R is a vector space over Q.) 

(a) Show that $\{1, i\}$ is a basis of $\mathbf{C}$ over $\mathbf{R}$; hence, $\mathbf{C}$ is a vector space of dimension 2 over $\mathbf{R}$.

(b) Show that $\mathbf{R}$ is a vector space of infinite dimension over $\mathbf{Q}$.

(a) For any $v \in \mathbf{C}$, we have $v = a + bi = a(1) + b(i)$, where $a, b \in \mathbf{R}$. Hence, $\{1, i\}$ spans $\mathbf{C}$ over $\mathbf{R}$.
Furthermore, if $x(1) + y(i) = 0$ or $x + yi = 0$, where $x, y \in \mathbf{R}$, then $x = 0$ and $y = 0$. Hence, $\{1, i\}$ is
linearly independent over $\mathbf{R}$. Thus, $\{1, i\}$ is a basis for $\mathbf{C}$ over $\mathbf{R}$.

(b) It can be shown that $\pi$ is a transcendental number; that is, $\pi$ is not a root of any polynomial over $\mathbf{Q}$.
Thus, for any $n$, the $n + 1$ real numbers $1, \pi, \pi^2, \ldots, \pi^n$ are linearly independent over $\mathbf{Q}$. Hence, $\mathbf{R}$ cannot be of
dimension $n$ over $\mathbf{Q}$. Accordingly, $\mathbf{R}$ is of infinite dimension over $\mathbf{Q}$.


4.28. Suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a subset of $V$. Show that the following Definitions A and B of a
basis of $V$ are equivalent:

(A) $S$ is linearly independent and spans $V$.

(B) Every $v \in V$ is a unique linear combination of vectors in $S$.

Suppose (A) holds. Because $S$ spans $V$, the vector $v$ is a linear combination of the $u_i$. Suppose it has two such representations, say

$$v = a_1u_1 + a_2u_2 + \cdots + a_nu_n \qquad\text{and}\qquad v = b_1u_1 + b_2u_2 + \cdots + b_nu_n$$

Subtracting, we get

$$0 = v - v = (a_1 - b_1)u_1 + (a_2 - b_2)u_2 + \cdots + (a_n - b_n)u_n$$

But the $u_i$ are linearly independent. Hence, the coefficients in the above relation are each 0:

$$a_1 - b_1 = 0, \quad a_2 - b_2 = 0, \quad \ldots, \quad a_n - b_n = 0$$

Therefore, $a_1 = b_1, a_2 = b_2, \ldots, a_n = b_n$. Hence, the representation of $v$ as a linear combination of the $u_i$ is
unique. Thus, (A) implies (B).

Suppose (B) holds. Then $S$ spans $V$. Suppose

$$0 = c_1u_1 + c_2u_2 + \cdots + c_nu_n$$

However, we do have

$$0 = 0u_1 + 0u_2 + \cdots + 0u_n$$

By hypothesis, the representation of 0 as a linear combination of the $u_i$ is unique. Hence, each $c_i = 0$ and the
$u_i$ are linearly independent. Thus, (B) implies (A).

Dimension and Subspaces 

4.29. Find a basis and dimension of the subspace $W$ of $\mathbf{R}^3$ where

(a) $W = \{(a, b, c) : a + b + c = 0\}$, (b) $W = \{(a, b, c) : a = b = c\}$

(a) Note that $W \ne \mathbf{R}^3$, because, for example, $(1, 2, 3) \notin W$. Thus, $\dim W < 3$. Note that $u_1 = (1, 0, -1)$
and $u_2 = (0, 1, -1)$ are two independent vectors in $W$. Thus, $\dim W = 2$, and so $u_1$ and $u_2$ form a basis
of $W$.

(b) The vector $u = (1, 1, 1) \in W$. Any vector $w \in W$ has the form $w = (k, k, k)$. Hence, $w = ku$. Thus, $u$
spans $W$ and $\dim W = 1$.


4.30. Let $W$ be the subspace of $\mathbf{R}^4$ spanned by the vectors

$$u_1 = (1, -2, 5, -3), \qquad u_2 = (2, 3, 1, -4), \qquad u_3 = (3, 8, -3, -5)$$

(a) Find a basis and dimension of $W$. (b) Extend the basis of $W$ to a basis of $\mathbf{R}^4$.

(a) Apply Algorithm 4.1, the row space algorithm. Form the matrix whose rows are the given vectors, and
reduce it to echelon form:

$$A = \begin{bmatrix} 1 & -2 & 5 & -3 \\ 2 & 3 & 1 & -4 \\ 3 & 8 & -3 & -5 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 5 & -3 \\ 0 & 7 & -9 & 2 \\ 0 & 14 & -18 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 5 & -3 \\ 0 & 7 & -9 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

The nonzero rows $(1, -2, 5, -3)$ and $(0, 7, -9, 2)$ of the echelon matrix form a basis of the row space of
$A$ and hence of $W$. Thus, in particular, $\dim W = 2$.

(b) We seek four linearly independent vectors, which include the above two vectors. The four vectors
$(1, -2, 5, -3)$, $(0, 7, -9, 2)$, $(0, 0, 1, 0)$, and $(0, 0, 0, 1)$ are linearly independent (because they form an
echelon matrix), and so they form a basis of $\mathbf{R}^4$, which is an extension of the basis of $W$.

4.31. Let $W$ be the subspace of $\mathbf{R}^5$ spanned by $u_1 = (1, 2, -1, 3, 4)$, $u_2 = (2, 4, -2, 6, 8)$,
$u_3 = (1, 3, 2, 2, 6)$, $u_4 = (1, 4, 5, 1, 8)$, $u_5 = (2, 7, 3, 3, 9)$. Find a subset of the vectors that
form a basis of $W$.

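Here we use Algorithm 4.2, the casting-out algorithm. Form the matrix $M$ whose columns (not rows) are
the given vectors, and reduce it to echelon form:

$$M = \begin{bmatrix} 1 & 2 & 1 & 1 & 2 \\ 2 & 4 & 3 & 4 & 7 \\ -1 & -2 & 2 & 5 & 3 \\ 3 & 6 & 2 & 1 & 3 \\ 4 & 8 & 6 & 8 & 9 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 1 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 3 & 6 & 5 \\ 0 & 0 & -1 & -2 & -3 \\ 0 & 0 & 2 & 4 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 1 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & -4 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

The pivot positions are in columns $C_1$, $C_3$, $C_5$. Hence, the corresponding vectors $u_1$, $u_3$, $u_5$ form a basis of $W$,
and $\dim W = 3$.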
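The casting-out procedure lends itself to a short program. What follows is a minimal sketch, assuming the five spanning vectors of Problem 4.31; the function name `casting_out` is hypothetical, and the code is one possible reading of the algorithm rather than a definitive implementation. It places the vectors in the columns of a matrix, eliminates forward with exact fractions, and keeps the vectors whose columns receive a pivot.

```python
from fractions import Fraction

def casting_out(vectors):
    """Return the (0-based) indices of vectors that are not combinations of earlier ones."""
    # zip(*vectors) transposes: column j of m holds vectors[j]
    m = [[Fraction(x) for x in row] for row in zip(*vectors)]
    keep, r = [], 0
    for col in range(len(vectors)):
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue  # no pivot: this vector depends on the preceding ones
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        keep.append(col)
        r += 1
    return keep

us = [(1, 2, -1, 3, 4), (2, 4, -2, 6, 8), (1, 3, 2, 2, 6),
      (1, 4, 5, 1, 8), (2, 7, 3, 3, 9)]
print([j + 1 for j in casting_out(us)])  # [1, 3, 5]
```

The printed indices are 1-based so that they line up with the column labels $C_1$, $C_3$, $C_5$ used in the solution.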


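4.32. Let $V$ be the vector space of $2 \times 2$ matrices over $K$. Let $W$ be the subspace of symmetric matrices.
Show that $\dim W = 3$, by finding a basis of $W$.

Recall that a matrix $A = [a_{ij}]$ is symmetric if $A^T = A$, or, equivalently, each $a_{ij} = a_{ji}$. Thus, $A = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$
denotes an arbitrary $2 \times 2$ symmetric matrix. Setting (i) $a = 1$, $b = 0$, $d = 0$; (ii) $a = 0$, $b = 1$, $d = 0$;
(iii) $a = 0$, $b = 0$, $d = 1$, we obtain the respective matrices:

$$E_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad E_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad E_3 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

We claim that $S = \{E_1, E_2, E_3\}$ is a basis of $W$; that is, (a) $S$ spans $W$ and (b) $S$ is linearly independent.

(a) The above matrix $A = \begin{bmatrix} a & b \\ b & d \end{bmatrix} = aE_1 + bE_2 + dE_3$. Thus, $S$ spans $W$.

(b) Suppose $xE_1 + yE_2 + zE_3 = 0$, where $x$, $y$, $z$ are unknown scalars. That is, suppose

$$x\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + y\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + z\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad\text{or}\qquad \begin{bmatrix} x & y \\ y & z \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

Setting corresponding entries equal to each other yields $x = 0$, $y = 0$, $z = 0$. Thus, $S$ is linearly independent.
Therefore, $S$ is a basis of $W$, as claimed.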


Theorems on Linear Dependence, Basis, and Dimension 

4.33. Prove Lemma 4.10: Suppose two or more nonzero vectors $v_1, v_2, \ldots, v_m$ are linearly dependent.
Then one of them is a linear combination of the preceding vectors.

Because the $v_i$ are linearly dependent, there exist scalars $a_1, \ldots, a_m$, not all 0, such that
$a_1v_1 + \cdots + a_mv_m = 0$. Let $k$ be the largest integer such that $a_k \ne 0$. Then

$$a_1v_1 + \cdots + a_kv_k + 0v_{k+1} + \cdots + 0v_m = 0 \qquad\text{or}\qquad a_1v_1 + \cdots + a_kv_k = 0$$

Suppose $k = 1$; then $a_1v_1 = 0$, $a_1 \ne 0$, and so $v_1 = 0$. But the $v_i$ are nonzero vectors. Hence, $k > 1$ and

$$v_k = -a_k^{-1}a_1v_1 - \cdots - a_k^{-1}a_{k-1}v_{k-1}$$

That is, $v_k$ is a linear combination of the preceding vectors.


4.34. Suppose $S = \{v_1, v_2, \ldots, v_m\}$ spans a vector space $V$.

(a) If $w \in V$, then $\{w, v_1, \ldots, v_m\}$ is linearly dependent and spans $V$.

(b) If $v_i$ is a linear combination of $v_1, \ldots, v_{i-1}$, then $S$ without $v_i$ spans $V$.

(a) The vector $w$ is a linear combination of the $v_i$, because $\{v_i\}$ spans $V$. Accordingly, $\{w, v_1, \ldots, v_m\}$ is
linearly dependent. Clearly, $w$ with the $v_i$ span $V$, as the $v_i$ by themselves span $V$; that is, $\{w, v_1, \ldots, v_m\}$
spans $V$.

(b) Suppose $v_i = k_1v_1 + \cdots + k_{i-1}v_{i-1}$. Let $u \in V$. Because $\{v_i\}$ spans $V$, $u$ is a linear combination of the
$v_j$'s, say $u = a_1v_1 + \cdots + a_mv_m$. Substituting for $v_i$, we obtain

$$\begin{aligned} u &= a_1v_1 + \cdots + a_{i-1}v_{i-1} + a_i(k_1v_1 + \cdots + k_{i-1}v_{i-1}) + a_{i+1}v_{i+1} + \cdots + a_mv_m \\ &= (a_1 + a_ik_1)v_1 + \cdots + (a_{i-1} + a_ik_{i-1})v_{i-1} + a_{i+1}v_{i+1} + \cdots + a_mv_m \end{aligned}$$

Thus, $\{v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_m\}$ spans $V$. In other words, we can delete $v_i$ from the spanning set and still
retain a spanning set.

4.35. Prove Lemma 4.13: Suppose $\{v_1, v_2, \ldots, v_n\}$ spans $V$, and suppose $\{w_1, w_2, \ldots, w_m\}$ is linearly
independent. Then $m \le n$, and $V$ is spanned by a set of the form

$$\{w_1, w_2, \ldots, w_m, v_{i_1}, v_{i_2}, \ldots, v_{i_{n-m}}\}$$

Thus, any $n + 1$ or more vectors in $V$ are linearly dependent.

It suffices to prove the lemma in the case that the $v_i$ are all not 0. (Prove!) Because $\{v_i\}$ spans $V$, we have
by Problem 4.34 that

$$\{w_1, v_1, \ldots, v_n\} \qquad (1)$$

is linearly dependent and also spans $V$. By Lemma 4.10, one of the vectors in (1) is a linear combination of the
preceding vectors. This vector cannot be $w_1$, so it must be one of the $v$'s, say $v_j$. Thus by Problem 4.34, we
can delete $v_j$ from the spanning set (1) and obtain the spanning set

$$\{w_1, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_n\} \qquad (2)$$

Now we repeat the argument with the vector $w_2$. That is, because (2) spans $V$, the set

$$\{w_1, w_2, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_n\} \qquad (3)$$

is linearly dependent and also spans $V$. Again by Lemma 4.10, one of the vectors in (3) is a linear combination
of the preceding vectors. We emphasize that this vector cannot be $w_1$ or $w_2$, because $\{w_1, \ldots, w_m\}$ is
independent; hence, it must be one of the $v$'s, say $v_k$. Thus, by Problem 4.34, we can delete $v_k$ from the
spanning set (3) and obtain the spanning set

$$\{w_1, w_2, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_{k-1}, v_{k+1}, \ldots, v_n\}$$

We repeat the argument with $w_3$, and so forth. At each step, we are able to add one of the $w$'s and delete
one of the $v$'s in the spanning set. If $m \le n$, then we finally obtain a spanning set of the required form:

$$\{w_1, \ldots, w_m, v_{i_1}, \ldots, v_{i_{n-m}}\}$$

Finally, we show that $m > n$ is not possible. Otherwise, after $n$ of the above steps, we obtain the spanning
set $\{w_1, \ldots, w_n\}$. This implies that $w_{n+1}$ is a linear combination of $w_1, \ldots, w_n$, which contradicts the
hypothesis that $\{w_i\}$ is linearly independent.

4.36. Prove Theorem 4.12: Every basis of a vector space $V$ has the same number of elements.

Suppose $\{u_1, u_2, \ldots, u_n\}$ is a basis of $V$, and suppose $\{v_1, v_2, \ldots\}$ is another basis of $V$. Because $\{u_i\}$
spans $V$, the basis $\{v_1, v_2, \ldots\}$ must contain $n$ or fewer vectors, or else it is linearly dependent by
Problem 4.35 (Lemma 4.13). On the other hand, if the basis $\{v_1, v_2, \ldots\}$ contains fewer than $n$ elements,
then $\{u_1, u_2, \ldots, u_n\}$ is linearly dependent by Problem 4.35. Thus, the basis $\{v_1, v_2, \ldots\}$ contains exactly $n$
vectors, and so the theorem is true.

4.37. Prove Theorem 4.14: Let $V$ be a vector space of finite dimension $n$. Then

(i) Any $n + 1$ or more vectors must be linearly dependent.

(ii) Any linearly independent set $S = \{u_1, u_2, \ldots, u_n\}$ with $n$ elements is a basis of $V$.

(iii) Any spanning set $T = \{v_1, v_2, \ldots, v_n\}$ of $V$ with $n$ elements is a basis of $V$.

Suppose $B = \{w_1, w_2, \ldots, w_n\}$ is a basis of $V$.

(i) Because $B$ spans $V$, any $n + 1$ or more vectors are linearly dependent by Lemma 4.13.

(ii) By Lemma 4.13, elements from $B$ can be adjoined to $S$ to form a spanning set of $V$ with $n$ elements.
Because $S$ already has $n$ elements, $S$ itself is a spanning set of $V$. Thus, $S$ is a basis of $V$.

(iii) Suppose $T$ is linearly dependent. Then some $v_i$ is a linear combination of the preceding vectors. By
Problem 4.34, $V$ is spanned by the vectors in $T$ without $v_i$ and there are $n - 1$ of them. By Lemma 4.13,
the independent set $B$ cannot have more than $n - 1$ elements. This contradicts the fact that $B$ has $n$
elements. Thus, $T$ is linearly independent, and hence $T$ is a basis of $V$.


4.38. Prove Theorem 4.15: Suppose S spans a vector space V. Then 

(i) Any maximum number of linearly independent vectors in S form a basis of V. 

(ii) Suppose one deletes from S every vector that is a linear combination of preceding vectors in S. 
Then the remaining vectors form a basis of V. 

(i) Suppose $\{v_1, \ldots, v_m\}$ is a maximum linearly independent subset of $S$, and suppose $w \in S$. Accordingly,
$\{v_1, \ldots, v_m, w\}$ is linearly dependent. No $v_k$ can be a linear combination of preceding vectors. Hence, $w$
is a linear combination of the $v_i$. Thus, $w \in \operatorname{span}(v_i)$, and hence $S \subseteq \operatorname{span}(v_i)$. This leads to

$$V = \operatorname{span}(S) \subseteq \operatorname{span}(v_i) \subseteq V$$

Thus, $\{v_i\}$ spans $V$, and, as it is linearly independent, it is a basis of $V$.

(ii) The remaining vectors form a maximum linearly independent subset of $S$; hence, by (i), it is a basis of $V$.


4.39. Prove Theorem 4.16: Let $V$ be a vector space of finite dimension and let $S = \{u_1, u_2, \ldots, u_r\}$ be a
set of linearly independent vectors in $V$. Then $S$ is part of a basis of $V$; that is, $S$ may be extended to
a basis of $V$.

Suppose $B = \{w_1, w_2, \ldots, w_n\}$ is a basis of $V$. Then $B$ spans $V$, and hence $V$ is spanned by

$$S \cup B = \{u_1, u_2, \ldots, u_r, w_1, w_2, \ldots, w_n\}$$

By Theorem 4.15, we can delete from $S \cup B$ each vector that is a linear combination of preceding vectors to
obtain a basis $B'$ for $V$. Because $S$ is linearly independent, no $u_k$ is a linear combination of preceding vectors.
Thus, $B'$ contains every vector in $S$, and $S$ is part of the basis $B'$ for $V$.


4.40. Prove Theorem 4.17: Let $W$ be a subspace of an $n$-dimensional vector space $V$. Then $\dim W \le n$. In
particular, if $\dim W = n$, then $W = V$.

Because $V$ is of dimension $n$, any $n + 1$ or more vectors are linearly dependent. Furthermore, because a
basis of $W$ consists of linearly independent vectors, it cannot contain more than $n$ elements. Accordingly,
$\dim W \le n$.

In particular, if $\{w_1, \ldots, w_n\}$ is a basis of $W$, then, because it is an independent set with $n$ elements, it is
also a basis of $V$. Thus, $W = V$ when $\dim W = n$.


Rank of a Matrix, Row and Column Spaces 


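4.41. Find the rank and basis of the row space of each of the following matrices:

$$\text{(a)}\ A = \begin{bmatrix} 1 & 2 & 0 & -1 \\ 2 & 6 & -3 & -3 \\ 3 & 10 & -6 & -5 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 1 & 3 & 1 & -2 & -3 \\ 1 & 4 & 3 & -1 & -4 \\ 2 & 3 & -4 & -7 & -3 \\ 3 & 8 & 1 & -7 & -8 \end{bmatrix}$$

(a) Row reduce $A$ to echelon form:

$$A \sim \begin{bmatrix} 1 & 2 & 0 & -1 \\ 0 & 2 & -3 & -1 \\ 0 & 4 & -6 & -2 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & -1 \\ 0 & 2 & -3 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

The two nonzero rows $(1, 2, 0, -1)$ and $(0, 2, -3, -1)$ of the echelon form of $A$ form a basis for
$\operatorname{rowsp}(A)$. In particular, $\operatorname{rank}(A) = 2$.

(b) Row reduce $B$ to echelon form:

$$B \sim \begin{bmatrix} 1 & 3 & 1 & -2 & -3 \\ 0 & 1 & 2 & 1 & -1 \\ 0 & -3 & -6 & -3 & 3 \\ 0 & -1 & -2 & -1 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 3 & 1 & -2 & -3 \\ 0 & 1 & 2 & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

The two nonzero rows $(1, 3, 1, -2, -3)$ and $(0, 1, 2, 1, -1)$ of the echelon form of $B$ form a basis for
$\operatorname{rowsp}(B)$. In particular, $\operatorname{rank}(B) = 2$.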

4.42. Show that U = W, where U and W are the following subspaces of R 3 : 

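$$U = \operatorname{span}(u_1, u_2, u_3) = \operatorname{span}\{(1, 1, -1),\ (2, 3, -1),\ (3, 1, -5)\}$$

$$W = \operatorname{span}(w_1, w_2, w_3) = \operatorname{span}\{(1, -1, -3),\ (3, -2, -8),\ (2, 1, -3)\}$$

Form the matrix $A$ whose rows are the $u_i$, and row reduce $A$ to row canonical form:

$$A = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 3 & -1 \\ 3 & 1 & -5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 0 & -2 & -2 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

Next form the matrix $B$ whose rows are the $w_j$, and row reduce $B$ to row canonical form:

$$B = \begin{bmatrix} 1 & -1 & -3 \\ 3 & -2 & -8 \\ 2 & 1 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & -1 & -3 \\ 0 & 1 & 1 \\ 0 & 3 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

Because $A$ and $B$ have the same row canonical form, the row spaces of $A$ and $B$ are equal, and so $U = W$.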
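The comparison of row canonical forms can be automated. The sketch below is an illustration with a hypothetical helper `rref`; it assumes the two sets of spanning vectors from Problem 4.42, reduces each matrix to row canonical form with exact fractions, drops zero rows, and compares the results.

```python
from fractions import Fraction

def rref(matrix):
    """Row canonical form with zero rows removed, using exact fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        pivot = m[r][col]
        m[r] = [x / pivot for x in m[r]]          # make the pivot equal to 1
        for i in range(len(m)):
            if i != r and m[i][col] != 0:         # clear the column above and below
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return [row for row in m if any(row)]

A = [(1, 1, -1), (2, 3, -1), (3, 1, -5)]
B = [(1, -1, -3), (3, -2, -8), (2, 1, -3)]
print(rref(A) == rref(B))                            # True
print([[int(x) for x in row] for row in rref(A)])    # [[1, 0, -2], [0, 1, 1]]
```

Because the row canonical form of a matrix is unique, equality of the two outputs is both necessary and sufficient for the row spaces to coincide.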

4.43. Let $A = \begin{bmatrix} 1 & 2 & 1 & 2 & 3 & 1 \\ 2 & 4 & 3 & 7 & 7 & 4 \\ 1 & 2 & 2 & 5 & 5 & 6 \\ 3 & 6 & 6 & 15 & 14 & 15 \end{bmatrix}$.

(a) Find $\operatorname{rank}(M_k)$, for $k = 1, 2, \ldots, 6$, where $M_k$ is the submatrix of $A$ consisting of the first $k$
columns $C_1, C_2, \ldots, C_k$ of $A$.

(b) Which columns $C_{k+1}$ are linear combinations of preceding columns $C_1, \ldots, C_k$?

(c) Find columns of $A$ that form a basis for the column space of $A$.

(d) Express column $C_4$ as a linear combination of the columns in part (c).

(a) Row reduce $A$ to echelon form:

$$A \sim \begin{bmatrix} 1 & 2 & 1 & 2 & 3 & 1 \\ 0 & 0 & 1 & 3 & 1 & 2 \\ 0 & 0 & 1 & 3 & 2 & 5 \\ 0 & 0 & 3 & 9 & 5 & 12 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 2 & 3 & 1 \\ 0 & 0 & 1 & 3 & 1 & 2 \\ 0 & 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

Observe that this simultaneously reduces all the matrices $M_k$ to echelon form; for example, the first four
columns of the echelon form of $A$ are an echelon form of $M_4$. We know that $\operatorname{rank}(M_k)$ is equal to the
number of pivots or, equivalently, the number of nonzero rows in an echelon form of $M_k$. Thus,

$$\operatorname{rank}(M_1) = \operatorname{rank}(M_2) = 1, \qquad \operatorname{rank}(M_3) = \operatorname{rank}(M_4) = 2, \qquad \operatorname{rank}(M_5) = \operatorname{rank}(M_6) = 3$$

(b) The vector equation $x_1C_1 + x_2C_2 + \cdots + x_kC_k = C_{k+1}$ yields the system with coefficient matrix $M_k$ and
augmented matrix $M_{k+1}$. Thus, $C_{k+1}$ is a linear combination of $C_1, \ldots, C_k$ if and only if
$\operatorname{rank}(M_k) = \operatorname{rank}(M_{k+1})$ or, equivalently, if $C_{k+1}$ does not contain a pivot. Thus, each of $C_2$, $C_4$, $C_6$
is a linear combination of preceding columns.

(c) In the echelon form of $A$, the pivots are in the first, third, and fifth columns. Thus, columns $C_1$, $C_3$, $C_5$ of
$A$ form a basis for the column space of $A$. Alternatively, deleting columns $C_2$, $C_4$, $C_6$ from the spanning
set of columns (they are linear combinations of other columns), we obtain, again, $C_1$, $C_3$, $C_5$.

(d) The echelon matrix tells us that $C_4$ is a linear combination of columns $C_1$ and $C_3$. The augmented matrix
$M$ of the vector equation $C_4 = xC_1 + yC_3$ consists of the columns $C_1$, $C_3$, $C_4$ of $A$ which, when reduced
to echelon form, yields the matrix (omitting zero rows)

$$\begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \end{bmatrix} \qquad\text{or}\qquad \begin{array}{rcl} x + y &=& 2 \\ y &=& 3 \end{array} \qquad\text{or}\qquad x = -1,\ y = 3$$

Thus, $C_4 = -C_1 + 3C_3 = -C_1 + 3C_3 + 0C_5$.
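Parts (a), (b), and (d) can be confirmed with a few lines of code. The following is a sketch under the data of Problem 4.43; the helper `rank` is hypothetical. It computes the rank of each leading column block $M_k$, lists the columns at which the rank fails to increase, and checks the stated expression for $C_4$ entry by entry.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix, found by forward elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][col] / rows[rk][col]
            rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

A = [
    [1, 2, 1, 2, 3, 1],
    [2, 4, 3, 7, 7, 4],
    [1, 2, 2, 5, 5, 6],
    [3, 6, 6, 15, 14, 15],
]

# rank(M_k) for k = 1..6, where M_k keeps the first k columns of A
ranks = [rank([row[:k] for row in A]) for k in range(1, 7)]
print(ranks)  # [1, 1, 2, 2, 3, 3]

# C_k depends on the preceding columns exactly when the rank does not increase
dependent = [k for k in range(2, 7) if ranks[k - 1] == ranks[k - 2]]
print(dependent)  # [2, 4, 6]

# check C_4 = -C_1 + 3*C_3 entry by entry
cols = list(zip(*A))
print(all(c4 == -c1 + 3 * c3 for c1, c3, c4 in zip(cols[0], cols[2], cols[3])))  # True
```

Recomputing the rank for each block is wasteful for large matrices; a single reduction that records pivot columns would do the same job, but the version above follows the wording of part (a) more directly.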


4.44. Suppose $u = (a_1, a_2, \ldots, a_n)$ is a linear combination of the rows $R_1, R_2, \ldots, R_m$ of a matrix
$B = [b_{ij}]$, say $u = k_1R_1 + k_2R_2 + \cdots + k_mR_m$. Prove that

$$a_i = k_1b_{1i} + k_2b_{2i} + \cdots + k_mb_{mi}, \qquad i = 1, 2, \ldots, n$$

where $b_{1i}, b_{2i}, \ldots, b_{mi}$ are the entries in the $i$th column of $B$.

We are given that $u = k_1R_1 + k_2R_2 + \cdots + k_mR_m$. Hence,

$$\begin{aligned} (a_1, a_2, \ldots, a_n) &= k_1(b_{11}, \ldots, b_{1n}) + \cdots + k_m(b_{m1}, \ldots, b_{mn}) \\ &= (k_1b_{11} + \cdots + k_mb_{m1},\ \ldots,\ k_1b_{1n} + \cdots + k_mb_{mn}) \end{aligned}$$

Setting corresponding components equal to each other, we obtain the desired result.

4.45. Prove Theorem 4.7: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are row equivalent echelon matrices with 
respective pivot entries 

$$
a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r} \quad\text{and}\quad b_{1k_1}, b_{2k_2}, \ldots, b_{sk_s}
$$

(pictured in Fig. 4-7). Then $A$ and $B$ have the same number of nonzero rows (that is, $r = s$) and 
their pivot entries are in the same positions; that is, $j_1 = k_1, j_2 = k_2, \ldots, j_r = k_r$. 

[Figure 4-7: the echelon matrices $A$ and $B$, showing the pivot entries $a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r}$ and $b_{1k_1}, b_{2k_2}, \ldots, b_{sk_s}$ with arbitrary entries (*) to their right.] 

Clearly $A = 0$ if and only if $B = 0$, and so we need only prove the theorem when $r \geq 1$ and $s \geq 1$. We 
first show that $j_1 = k_1$. Suppose $j_1 < k_1$. Then the $j_1$th column of $B$ is zero. Because the first row $R^*$ of $A$ is in 
the row space of $B$, we have $R^* = c_1R_1 + c_2R_2 + \cdots + c_mR_m$, where the $R_i$ are the rows of $B$. Because the 
$j_1$th column of $B$ is zero, we have 

$$
a_{1j_1} = c_1 0 + c_2 0 + \cdots + c_m 0 = 0
$$

But this contradicts the fact that the pivot entry $a_{1j_1} \neq 0$. Hence, $j_1 \geq k_1$ and, similarly, $k_1 \geq j_1$. Thus $j_1 = k_1$. 

Now let $A'$ be the submatrix of $A$ obtained by deleting the first row of $A$, and let $B'$ be the submatrix of $B$ 
obtained by deleting the first row of $B$. We prove that $A'$ and $B'$ have the same row space. The theorem will 
then follow by induction, because $A'$ and $B'$ are also echelon matrices. 

Let $R = (a_1, a_2, \ldots, a_n)$ be any row of $A'$ and let $R_1, \ldots, R_m$ be the rows of $B$. Because $R$ is in the row 
space of $B$, there exist scalars $d_1, \ldots, d_m$ such that $R = d_1R_1 + d_2R_2 + \cdots + d_mR_m$. Because $A$ is in echelon 
form and $R$ is not the first row of $A$, the $j_1$th entry of $R$ is zero: $a_i = 0$ for $i = j_1 = k_1$. Furthermore, because $B$ 
is in echelon form, all the entries in the $k_1$th column of $B$ are 0 except the first: $b_{1k_1} \neq 0$, but 
$b_{2k_1} = 0, \ldots, b_{mk_1} = 0$. Thus, 

$$
0 = a_{k_1} = d_1b_{1k_1} + d_2 0 + \cdots + d_m 0 = d_1b_{1k_1}
$$

Now $b_{1k_1} \neq 0$ and so $d_1 = 0$. Thus, $R$ is a linear combination of $R_2, \ldots, R_m$ and so is in the row space of $B'$. 
Because $R$ was any row of $A'$, the row space of $A'$ is contained in the row space of $B'$. Similarly, the row space 
of $B'$ is contained in the row space of $A'$. Thus, $A'$ and $B'$ have the same row space, and so the theorem is 
proved. 

4.46. Prove Theorem 4.8: Suppose $A$ and $B$ are row canonical matrices. Then $A$ and $B$ have the same row 
space if and only if they have the same nonzero rows. 

Obviously, if $A$ and $B$ have the same nonzero rows, then they have the same row space. Thus we only 
have to prove the converse. 

Suppose $A$ and $B$ have the same row space, and suppose $R \neq 0$ is the $i$th row of $A$. Then there exist 
scalars $c_1, \ldots, c_s$ such that 

$$
R = c_1R_1 + c_2R_2 + \cdots + c_sR_s \qquad (1)
$$

where the $R_i$ are the nonzero rows of $B$. The theorem is proved if we show that $R = R_i$; that is, that $c_i = 1$ but 
$c_k = 0$ for $k \neq i$. 

Let $a_{ij_i}$ be the pivot entry in $R$; that is, the first nonzero entry of $R$. By (1) and Problem 4.44, 

$$
a_{ij_i} = c_1b_{1j_i} + c_2b_{2j_i} + \cdots + c_sb_{sj_i} \qquad (2)
$$

But, by Problem 4.45, $b_{ij_i}$ is a pivot entry of $B$, and, as $B$ is row reduced, it is the only nonzero entry in the $j_i$th 
column of $B$. Thus, from (2), we obtain $a_{ij_i} = c_ib_{ij_i}$. However, $a_{ij_i} = 1$ and $b_{ij_i} = 1$, because $A$ and $B$ are row 
reduced; hence, $c_i = 1$. 

Now suppose $k \neq i$, and $b_{kj_k}$ is the pivot entry in $R_k$. By (1) and Problem 4.44, 

$$
a_{ij_k} = c_1b_{1j_k} + c_2b_{2j_k} + \cdots + c_sb_{sj_k} \qquad (3)
$$

Because $B$ is row reduced, $b_{kj_k}$ is the only nonzero entry in the $j_k$th column of $B$. Hence, by (3), $a_{ij_k} = c_kb_{kj_k}$. 
Furthermore, by Problem 4.45, $a_{kj_k}$ is a pivot entry of $A$, and because $A$ is row reduced, $a_{ij_k} = 0$. Thus, 
$c_kb_{kj_k} = 0$, and as $b_{kj_k} = 1$, $c_k = 0$. Accordingly $R = R_i$, and the theorem is proved. 

4.47. Prove Corollary 4.9: Every matrix A is row equivalent to a unique matrix in row canonical form. 

Suppose $A$ is row equivalent to matrices $A_1$ and $A_2$, where $A_1$ and $A_2$ are in row canonical form. Then 
$\operatorname{rowsp}(A) = \operatorname{rowsp}(A_1)$ and $\operatorname{rowsp}(A) = \operatorname{rowsp}(A_2)$. Hence, $\operatorname{rowsp}(A_1) = \operatorname{rowsp}(A_2)$. Because $A_1$ and $A_2$ are 
in row canonical form, $A_1 = A_2$ by Theorem 4.8. Thus, the corollary is proved. 
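
As a small numerical illustration of the corollary (an added sketch, not part of the original proof), the code below reduces a matrix and a row-equivalent rearrangement of it to row canonical form and compares the results. The matrices and the helper name `rref` are chosen here only for illustration.

```python
from fractions import Fraction

def rref(rows):
    """Return the row canonical form of a matrix, using exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m

A = [[1, 2, 1], [2, 4, 3], [3, 6, 5]]

# A2 is obtained from A by elementary row operations:
# reorder the rows, double one row, add one row to another.
A2 = [
    A[2],
    [2 * x for x in A[0]],
    [a + b for a, b in zip(A[1], A[0])],
]

canonical_A = rref(A)
canonical_A2 = rref(A2)
print(canonical_A == canonical_A2)
```

Whatever sequence of elementary row operations is used, the row canonical form reached is the same, which is what the corollary asserts.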

4.48. Suppose $RB$ and $AB$ are defined, where $R$ is a row vector and $A$ and $B$ are matrices. Prove 

(a) $RB$ is a linear combination of the rows of $B$. 

(b) The row space of $AB$ is contained in the row space of $B$. 

(c) The column space of $AB$ is contained in the column space of $A$. 

(d) If $C$ is a column vector and $AC$ is defined, then $AC$ is a linear combination of the columns of $A$. 

(e) $\operatorname{rank}(AB) \leq \operatorname{rank}(B)$ and $\operatorname{rank}(AB) \leq \operatorname{rank}(A)$. 

(a) Suppose $R = (a_1, a_2, \ldots, a_m)$ and $B = [b_{ij}]$. Let $B_1, \ldots, B_m$ denote the rows of $B$ and $B^1, \ldots, B^n$ its 
columns. Then 

$$
\begin{aligned}
RB &= (RB^1, RB^2, \ldots, RB^n) \\
&= (a_1b_{11} + a_2b_{21} + \cdots + a_mb_{m1},\ \ldots,\ a_1b_{1n} + a_2b_{2n} + \cdots + a_mb_{mn}) \\
&= a_1(b_{11}, b_{12}, \ldots, b_{1n}) + a_2(b_{21}, b_{22}, \ldots, b_{2n}) + \cdots + a_m(b_{m1}, b_{m2}, \ldots, b_{mn}) \\
&= a_1B_1 + a_2B_2 + \cdots + a_mB_m
\end{aligned}
$$

Thus, $RB$ is a linear combination of the rows of $B$, as claimed. 

(b) The rows of $AB$ are $R_iB$, where $R_i$ is the $i$th row of $A$. Thus, by part (a), each row of $AB$ is in the row 
space of $B$. Thus, $\operatorname{rowsp}(AB) \subseteq \operatorname{rowsp}(B)$, as claimed. 

(c) Using part (b), we have $\operatorname{colsp}(AB) = \operatorname{rowsp}((AB)^T) = \operatorname{rowsp}(B^TA^T) \subseteq \operatorname{rowsp}(A^T) = \operatorname{colsp}(A)$. 

(d) Follows from (c) where $C$ replaces $B$. 

(e) The row space of $AB$ is contained in the row space of $B$; hence, $\operatorname{rank}(AB) \leq \operatorname{rank}(B)$. Furthermore, the 
column space of $AB$ is contained in the column space of $A$; hence, $\operatorname{rank}(AB) \leq \operatorname{rank}(A)$. 

4.49. Let $A$ be an $n$-square matrix. Show that $A$ is invertible if and only if $\operatorname{rank}(A) = n$. 

Note that the rows of the $n$-square identity matrix $I_n$ are linearly independent, because $I_n$ is in echelon 
form; hence, $\operatorname{rank}(I_n) = n$. Now if $A$ is invertible, then $A$ is row equivalent to $I_n$; hence, $\operatorname{rank}(A) = n$. But if $A$ 
is not invertible, then $A$ is row equivalent to a matrix with a zero row; hence, $\operatorname{rank}(A) < n$; that is, $A$ is 
invertible if and only if $\operatorname{rank}(A) = n$. 
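
The rank inequality of Problem 4.48(e) and the invertibility criterion of Problem 4.49 can be tried on small examples. The following is a minimal sketch with matrices chosen here for illustration; `rank` and `matmul` are hypothetical helper names, and ranks are computed by forward elimination with exact fractions.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix = number of pivots found by forward elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [2, 4], [0, 1]]   # 3 x 2 matrix of rank 2
B = [[1, 0, 1], [1, 0, 1]]     # 2 x 3 matrix of rank 1
AB = matmul(A, B)

P = [[1, 2], [3, 4]]           # determinant -2, so invertible
Q = [[1, 2], [2, 4]]           # determinant 0, so not invertible

print(rank(A), rank(B), rank(AB))   # rank(AB) cannot exceed either factor
print(rank(P), rank(Q))             # full rank exactly for the invertible one
```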


Applications to Linear Equations 

4.50. Find the dimension and a basis of the solution space $W$ of each homogeneous system: 

$$
\text{(a)}\ \begin{aligned} x + 2y + 2z - s + 3t &= 0 \\ x + 2y + 3z + s + t &= 0 \\ 3x + 6y + 8z + s + 5t &= 0 \end{aligned}
\qquad
\text{(b)}\ \begin{aligned} x + 2y + z - 2t &= 0 \\ 2x + 4y + 4z - 3t &= 0 \\ 3x + 6y + 7z - 4t &= 0 \end{aligned}
\qquad
\text{(c)}\ \begin{aligned} x + y + 2z &= 0 \\ 2x + 3y + 3z &= 0 \\ x + 3y + 5z &= 0 \end{aligned}
$$

(a) Reduce the system to echelon form: 

$$
\begin{aligned} x + 2y + 2z - s + 3t &= 0 \\ z + 2s - 2t &= 0 \\ 2z + 4s - 4t &= 0 \end{aligned}
\qquad\text{or}\qquad
\begin{aligned} x + 2y + 2z - s + 3t &= 0 \\ z + 2s - 2t &= 0 \end{aligned}
$$

The system in echelon form has two (nonzero) equations in five unknowns. Hence, the system has 
$5 - 2 = 3$ free variables, which are $y$, $s$, $t$. Thus, $\dim W = 3$. We obtain a basis for $W$: 

(1) Set $y = 1, s = 0, t = 0$ to obtain the solution $v_1 = (-2, 1, 0, 0, 0)$. 

(2) Set $y = 0, s = 1, t = 0$ to obtain the solution $v_2 = (5, 0, -2, 1, 0)$. 

(3) Set $y = 0, s = 0, t = 1$ to obtain the solution $v_3 = (-7, 0, 2, 0, 1)$. 

The set $\{v_1, v_2, v_3\}$ is a basis of the solution space $W$. 

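(b) (Here we use the matrix format of our homogeneous system.) Reduce the coefficient matrix $A$ to echelon 
form: 

$$
A = \begin{bmatrix} 1 & 2 & 1 & -2 \\ 2 & 4 & 4 & -3 \\ 3 & 6 & 7 & -4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & -2 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 4 & 2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & -2 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
$$

This corresponds to the system 

$$
\begin{aligned} x + 2y + z - 2t &= 0 \\ 2z + t &= 0 \end{aligned}
$$

The free variables are $y$ and $t$, and $\dim W = 2$. 

(i) Set $y = 1, t = 0$ to obtain the solution $u_1 = (-2, 1, 0, 0)$. 

(ii) Set $y = 0, t = 2$ to obtain $z = -1$ and $x = -z + 2t = 5$; that is, the solution $u_2 = (5, 0, -1, 2)$. 

Then $\{u_1, u_2\}$ is a basis of $W$. 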

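(c) Reduce the coefficient matrix $A$ to echelon form: 

$$
A = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 3 \\ 1 & 3 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -1 \\ 0 & 2 & 3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & 5 \end{bmatrix}
$$

This corresponds to a triangular system with no free variables. Thus, 0 is the only solution; that is, 
$W = \{0\}$. Hence, $\dim W = 0$. 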
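
The free-variable recipe used in this problem (set one free variable to 1, the others to 0, and solve for the pivot variables) can be sketched in code. This is an illustrative implementation, not the book's; `rref` and `null_space_basis` are names introduced here, and the example is system (a).

```python
from fractions import Fraction

def rref(rows):
    """Return (row canonical form, pivot column indices) using exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(rows):
    """One basis vector per free variable: that variable is 1, other free ones 0."""
    R, pivots = rref(rows)
    n = len(rows[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for i, p in enumerate(pivots):
            v[p] = -R[i][f]          # pivot variable solved from row i
        basis.append(v)
    return basis

# Coefficient matrix of system (a), unknowns ordered x, y, z, s, t
system_a = [
    [1, 2, 2, -1, 3],
    [1, 2, 3, 1, 1],
    [3, 6, 8, 1, 5],
]
basis_a = null_space_basis(system_a)
dim_a = len(basis_a)
print(dim_a, basis_a)
```

The number of basis vectors equals the number of free variables, which is the number of unknowns minus the rank.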

4.51. Find a homogeneous system whose solution set $W$ is spanned by 

$$
\{u_1, u_2, u_3\} = \{(1, -2, 0, 3),\ (1, -1, -1, 4),\ (1, 0, -2, 5)\}
$$

Let $v = (x, y, z, t)$. Then $v \in W$ if and only if $v$ is a linear combination of the vectors $u_1, u_2, u_3$ that span 
$W$. Thus, form the matrix $M$ whose first columns are $u_1, u_2, u_3$ and whose last column is $v$, and then row 
reduce $M$ to echelon form. This yields 

$$
M = \begin{bmatrix} 1 & 1 & 1 & x \\ -2 & -1 & 0 & y \\ 0 & -1 & -2 & z \\ 3 & 4 & 5 & t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & x \\ 0 & 1 & 2 & 2x + y \\ 0 & -1 & -2 & z \\ 0 & 1 & 2 & -3x + t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & x \\ 0 & 1 & 2 & 2x + y \\ 0 & 0 & 0 & 2x + y + z \\ 0 & 0 & 0 & -5x - y + t \end{bmatrix}
$$

Then $v$ is a linear combination of $u_1, u_2, u_3$ if $\operatorname{rank}(M) = \operatorname{rank}(A)$, where $A$ is the submatrix without column 
$v$. Thus, set the last two entries in the fourth column on the right equal to zero to obtain the required 
homogeneous system: 

$$
\begin{aligned} 2x + y + z &= 0 \\ 5x + y - t &= 0 \end{aligned}
$$
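
A quick check of this answer (an added sketch, with helper names chosen here): every spanning vector should satisfy both equations, and the dimension of the span should equal the number of unknowns minus the number of independent equations.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix = number of pivots found by forward elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

spanning = [(1, -2, 0, 3), (1, -1, -1, 4), (1, 0, -2, 5)]

# Coefficient rows of 2x + y + z = 0 and 5x + y - t = 0, unknowns x, y, z, t
equations = [(2, 1, 1, 0), (5, 1, 0, -1)]

# residuals[i][j] = value of equation j at spanning vector i (all should be 0)
residuals = [[sum(c * x for c, x in zip(eq, u)) for eq in equations] for u in spanning]

dim_span = rank(spanning)               # dimension of W
dim_solutions = 4 - rank(equations)     # unknowns minus independent equations
print(residuals, dim_span, dim_solutions)
```

Because the span lies inside the solution space and the two dimensions agree, the solution space is exactly $W$.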

4.52. Let $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$ be the free variables of a homogeneous system of linear equations with $n$ 
unknowns. Let $v_j$ be the solution for which $x_{i_j} = 1$, and all other free variables equal 0. Show that 
the solutions $v_1, v_2, \ldots, v_k$ are linearly independent. 

Let $A$ be the matrix whose rows are the $v_i$. We interchange column 1 and column $i_1$, then column 2 and 
column $i_2$, ..., then column $k$ and column $i_k$, and we obtain the $k \times n$ matrix 

$$
B = [I, C] = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & c_{1,k+1} & \cdots & c_{1n} \\
0 & 1 & 0 & \cdots & 0 & 0 & c_{2,k+1} & \cdots & c_{2n} \\
\vdots & & & & & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 & c_{k,k+1} & \cdots & c_{kn}
\end{bmatrix}
$$

The above matrix $B$ is in echelon form, and so its rows are independent; hence, $\operatorname{rank}(B) = k$. Because $A$ and $B$ 
are column equivalent, they have the same rank: $\operatorname{rank}(A) = k$. But $A$ has $k$ rows; hence, these rows (i.e., the 
$v_j$) are linearly independent, as claimed. 

Sums, Direct Sums, Intersections 

4.53. Let $U$ and $W$ be subspaces of a vector space $V$. Show that 

(a) $U + W$ is a subspace of $V$. 

(b) $U$ and $W$ are contained in $U + W$. 

(c) $U + W$ is the smallest subspace containing $U$ and $W$; that is, $U + W = \operatorname{span}(U, W)$. 

(d) $W + W = W$. 

(a) Because $U$ and $W$ are subspaces, $0 \in U$ and $0 \in W$. Hence, $0 = 0 + 0$ belongs to $U + W$. Now suppose 
$v, v' \in U + W$. Then $v = u + w$ and $v' = u' + w'$, where $u, u' \in U$ and $w, w' \in W$. Then 

$$
av + bv' = (au + bu') + (aw + bw') \in U + W
$$

Thus, $U + W$ is a subspace of $V$. 

(b) Let $u \in U$. Because $W$ is a subspace, $0 \in W$. Hence, $u = u + 0$ belongs to $U + W$. Thus, $U \subseteq U + W$. 
Similarly, $W \subseteq U + W$. 

(c) Because $U + W$ is a subspace of $V$ containing $U$ and $W$, it must also contain the linear span of $U$ and $W$. 
That is, $\operatorname{span}(U, W) \subseteq U + W$. 

On the other hand, if $v \in U + W$, then $v = u + w = 1u + 1w$, where $u \in U$ and $w \in W$. Thus, $v$ is a 
linear combination of elements in $U \cup W$, and so $v \in \operatorname{span}(U, W)$. Hence, $U + W \subseteq \operatorname{span}(U, W)$. 

The two inclusion relations give the desired result. 

(d) Because $W$ is a subspace of $V$, we have that $W$ is closed under vector addition; hence, $W + W \subseteq W$. By 
part (b), $W \subseteq W + W$. Hence, $W + W = W$. 

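4.54. Consider the following subspaces of $\mathbf{R}^5$: 

$$
\begin{aligned}
U &= \operatorname{span}(u_1, u_2, u_3) = \operatorname{span}\{(1, 3, -2, 2, 3),\ (1, 4, -3, 4, 2),\ (2, 3, -1, -2, 9)\} \\
W &= \operatorname{span}(w_1, w_2, w_3) = \operatorname{span}\{(1, 3, 0, 2, 1),\ (1, 5, -6, 6, 3),\ (2, 5, 3, 2, 1)\}
\end{aligned}
$$

Find a basis and the dimension of (a) $U + W$, (b) $U \cap W$. 

(a) $U + W$ is the space spanned by all six vectors. Hence, form the matrix whose rows are the given six 
vectors, and then row reduce to echelon form: 

$$
\begin{bmatrix} 1 & 3 & -2 & 2 & 3 \\ 1 & 4 & -3 & 4 & 2 \\ 2 & 3 & -1 & -2 & 9 \\ 1 & 3 & 0 & 2 & 1 \\ 1 & 5 & -6 & 6 & 3 \\ 2 & 5 & 3 & 2 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & -3 & 3 & -6 & 3 \\ 0 & 0 & 2 & 0 & -2 \\ 0 & 2 & -4 & 4 & 0 \\ 0 & -1 & 7 & -2 & -5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
$$

The following three nonzero rows of the echelon matrix form a basis of $U + W$: 

$$
(1, 3, -2, 2, 3), \quad (0, 1, -1, 2, -1), \quad (0, 0, 1, 0, -1)
$$

Thus, $\dim(U + W) = 3$. 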

(b) Let $v = (x, y, z, s, t)$ denote an arbitrary element in $\mathbf{R}^5$. First find, say as in Problem 4.51, homogeneous 
systems whose solution sets are $U$ and $W$, respectively. 

Let $M$ be the matrix whose columns are the $u_i$ and $v$, and reduce $M$ to echelon form: 

$$
M = \begin{bmatrix} 1 & 1 & 2 & x \\ 3 & 4 & 3 & y \\ -2 & -3 & -1 & z \\ 2 & 4 & -2 & s \\ 3 & 2 & 9 & t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & x \\ 0 & 1 & -3 & -3x + y \\ 0 & 0 & 0 & -x + y + z \\ 0 & 0 & 0 & 4x - 2y + s \\ 0 & 0 & 0 & -6x + y + t \end{bmatrix}
$$

Set the last three entries in the last column equal to zero to obtain the following homogeneous system whose 
solution set is $U$: 

$$
-x + y + z = 0, \qquad 4x - 2y + s = 0, \qquad -6x + y + t = 0
$$

Now let $M'$ be the matrix whose columns are the $w_i$ and $v$, and reduce $M'$ to echelon form: 

$$
M' = \begin{bmatrix} 1 & 1 & 2 & x \\ 3 & 5 & 5 & y \\ 0 & -6 & 3 & z \\ 2 & 6 & 2 & s \\ 1 & 3 & 1 & t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & x \\ 0 & 2 & -1 & -3x + y \\ 0 & 0 & 0 & -9x + 3y + z \\ 0 & 0 & 0 & 4x - 2y + s \\ 0 & 0 & 0 & 2x - y + t \end{bmatrix}
$$

Again set the last three entries in the last column equal to zero to obtain the following homogeneous system 
whose solution set is $W$: 

$$
-9x + 3y + z = 0, \qquad 4x - 2y + s = 0, \qquad 2x - y + t = 0
$$

Combine both of the above systems to obtain a homogeneous system, whose solution space is $U \cap W$, and 
reduce the system to echelon form, yielding 

$$
\begin{aligned}
-x + y + z &= 0 \\
2y + 4z + s &= 0 \\
8z + 5s + 2t &= 0 \\
s - 2t &= 0
\end{aligned}
$$

There is one free variable, which is $t$; hence, $\dim(U \cap W) = 1$. Setting $t = 2$, we obtain the solution 
$u = (1, 4, -3, 4, 2)$, which forms our required basis of $U \cap W$. 
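
The dimensions found in this problem can be cross-checked with the formula $\dim(U \cap W) = \dim U + \dim W - \dim(U + W)$ of Theorem 4.20. The sketch below is an added illustration (the `rank` helper is a name chosen here); membership of a vector in a span is tested by checking that appending it does not raise the rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix = number of pivots found by forward elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

U = [(1, 3, -2, 2, 3), (1, 4, -3, 4, 2), (2, 3, -1, -2, 9)]
W = [(1, 3, 0, 2, 1), (1, 5, -6, 6, 3), (2, 5, 3, 2, 1)]

dim_U = rank(U)
dim_W = rank(W)
dim_sum = rank(U + W)                       # all six vectors span U + W
dim_intersection = dim_U + dim_W - dim_sum  # Theorem 4.20

# The basis vector found for the intersection must lie in both subspaces.
u = (1, 4, -3, 4, 2)
in_U = rank(U + [u]) == dim_U
in_W = rank(W + [u]) == dim_W
print(dim_U, dim_W, dim_sum, dim_intersection, in_U, in_W)
```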

4.55. Suppose $U$ and $W$ are distinct four-dimensional subspaces of a vector space $V$, where $\dim V = 6$. 
Find the possible dimensions of $U \cap W$. 

Because $U$ and $W$ are distinct, $U + W$ properly contains $U$ and $W$; consequently, $\dim(U + W) > 4$. But 
$\dim(U + W)$ cannot be greater than 6, as $\dim V = 6$. Hence, we have two possibilities: (a) $\dim(U + W) = 5$ 
or (b) $\dim(U + W) = 6$. By Theorem 4.20, 

$$
\dim(U \cap W) = \dim U + \dim W - \dim(U + W) = 8 - \dim(U + W)
$$

Thus (a) $\dim(U \cap W) = 3$ or (b) $\dim(U \cap W) = 2$. 

4.56. Let $U$ and $W$ be the following subspaces of $\mathbf{R}^3$: 

$$
U = \{(a, b, c) : a = b = c\} \quad\text{and}\quad W = \{(0, b, c)\}
$$

(Note that $W$ is the $yz$-plane.) Show that $\mathbf{R}^3 = U \oplus W$. 

First we show that $U \cap W = \{0\}$. Suppose $v = (a, b, c) \in U \cap W$. Then $a = b = c$ and $a = 0$. Hence, 
$a = 0$, $b = 0$, $c = 0$. Thus, $v = 0 = (0, 0, 0)$. 

Next we show that $\mathbf{R}^3 = U + W$. For, if $v = (a, b, c) \in \mathbf{R}^3$, then 

$$
v = (a, a, a) + (0, b - a, c - a) \quad\text{where}\quad (a, a, a) \in U \ \text{and}\ (0, b - a, c - a) \in W
$$

Both conditions $U \cap W = \{0\}$ and $U + W = \mathbf{R}^3$ imply that $\mathbf{R}^3 = U \oplus W$. 
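
The decomposition used in the solution of Problem 4.56, with $U$ the line of vectors having three equal components and $W$ the $yz$-plane, is easy to express as a function. The sketch below is illustrative only; `decompose` is a name chosen here.

```python
def decompose(v):
    """Split v = (a, b, c) into u + w with u in U (all components equal)
    and w in W (first component zero)."""
    a, b, c = v
    u = (a, a, a)
    w = (0, b - a, c - a)
    return u, w

v = (2, 5, -1)
u, w = decompose(v)
total = tuple(x + y for x, y in zip(u, w))
print(u, w, total)
```

Since the first component of $w$ must be 0, the first component of $u$ is forced to equal $a$, which is why the decomposition is unique.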

4.57. Suppose that $U$ and $W$ are subspaces of a vector space $V$ and that $S = \{u_i\}$ spans $U$ and $S' = \{w_j\}$ 
spans $W$. Show that $S \cup S'$ spans $U + W$. (Accordingly, by induction, if $S_i$ spans $W_i$ for 
$i = 1, 2, \ldots, n$, then $S_1 \cup \cdots \cup S_n$ spans $W_1 + \cdots + W_n$.) 

Let $v \in U + W$. Then $v = u + w$, where $u \in U$ and $w \in W$. Because $S$ spans $U$, $u$ is a linear combination 
of the $u_i$, and as $S'$ spans $W$, $w$ is a linear combination of the $w_j$; say 

$$
u = a_1u_{i_1} + a_2u_{i_2} + \cdots + a_ru_{i_r} \quad\text{and}\quad w = b_1w_{j_1} + b_2w_{j_2} + \cdots + b_sw_{j_s}
$$

where $a_i, b_j \in K$. Then 

$$
v = u + w = a_1u_{i_1} + a_2u_{i_2} + \cdots + a_ru_{i_r} + b_1w_{j_1} + b_2w_{j_2} + \cdots + b_sw_{j_s}
$$

Accordingly, $S \cup S' = \{u_i, w_j\}$ spans $U + W$. 

4.58. Prove Theorem 4.20: Suppose $U$ and $W$ are finite-dimensional subspaces of a vector space $V$. Then 
$U + W$ has finite dimension and 

$$
\dim(U + W) = \dim U + \dim W - \dim(U \cap W)
$$

Observe that $U \cap W$ is a subspace of both $U$ and $W$. Suppose $\dim U = m$, $\dim W = n$, $\dim(U \cap W) = r$. 
Suppose $\{v_1, \ldots, v_r\}$ is a basis of $U \cap W$. By Theorem 4.16, we can extend $\{v_i\}$ to a basis of $U$ and to a basis 
of $W$; say 

$$
\{v_1, \ldots, v_r, u_1, \ldots, u_{m-r}\} \quad\text{and}\quad \{v_1, \ldots, v_r, w_1, \ldots, w_{n-r}\}
$$

are bases of $U$ and $W$, respectively. Let 

$$
B = \{v_1, \ldots, v_r, u_1, \ldots, u_{m-r}, w_1, \ldots, w_{n-r}\}
$$

Note that $B$ has exactly $m + n - r$ elements. Thus, the theorem is proved if we can show that $B$ is a basis 
of $U + W$. Because $\{v_i, u_j\}$ spans $U$ and $\{v_i, w_k\}$ spans $W$, the union $B = \{v_i, u_j, w_k\}$ spans $U + W$. Thus, it 
suffices to show that $B$ is independent. 

Suppose 

$$
a_1v_1 + \cdots + a_rv_r + b_1u_1 + \cdots + b_{m-r}u_{m-r} + c_1w_1 + \cdots + c_{n-r}w_{n-r} = 0 \qquad (1)
$$

where $a_i$, $b_j$, $c_k$ are scalars. Let 

$$
v = a_1v_1 + \cdots + a_rv_r + b_1u_1 + \cdots + b_{m-r}u_{m-r} \qquad (2)
$$

By (1), we also have 

$$
v = -c_1w_1 - \cdots - c_{n-r}w_{n-r} \qquad (3)
$$

Because $\{v_i, u_j\} \subseteq U$, $v \in U$ by (2); and as $\{w_k\} \subseteq W$, $v \in W$ by (3). Accordingly, $v \in U \cap W$. Now $\{v_i\}$ is a 
basis of $U \cap W$, and so there exist scalars $d_1, \ldots, d_r$ for which $v = d_1v_1 + \cdots + d_rv_r$. Thus, by (3), we have 

$$
d_1v_1 + \cdots + d_rv_r + c_1w_1 + \cdots + c_{n-r}w_{n-r} = 0
$$

But $\{v_i, w_k\}$ is a basis of $W$, and so is independent. Hence, the above equation forces $c_1 = 0, \ldots, c_{n-r} = 0$. 
Substituting this into (1), we obtain 

$$
a_1v_1 + \cdots + a_rv_r + b_1u_1 + \cdots + b_{m-r}u_{m-r} = 0
$$

But $\{v_i, u_j\}$ is a basis of $U$, and so is independent. Hence, the above equation forces 
$a_1 = 0, \ldots, a_r = 0$, $b_1 = 0, \ldots, b_{m-r} = 0$. 

Because (1) implies that the $a_i$, $b_j$, $c_k$ are all 0, $B = \{v_i, u_j, w_k\}$ is independent, and the theorem is 
proved. 

4.59. Prove Theorem 4.21: $V = U \oplus W$ if and only if (i) $V = U + W$, (ii) $U \cap W = \{0\}$. 

Suppose $V = U \oplus W$. Then any $v \in V$ can be uniquely written in the form $v = u + w$, where $u \in U$ and 
$w \in W$. Thus, in particular, $V = U + W$. Now suppose $v \in U \cap W$. Then 

(1) $v = v + 0$, where $v \in U$, $0 \in W$, (2) $v = 0 + v$, where $0 \in U$, $v \in W$. 

Because such a sum for $v$ must be unique, $v = 0$. Thus, $v = 0 + 0 = 0$ and $U \cap W = \{0\}$. 

On the other hand, suppose $V = U + W$ and $U \cap W = \{0\}$. Let $v \in V$. Because $V = U + W$, there exist 
$u \in U$ and $w \in W$ such that $v = u + w$. We need to show that such a sum is unique. Suppose also that 
$v = u' + w'$, where $u' \in U$ and $w' \in W$. Then 

$$
u + w = u' + w', \quad\text{and so}\quad u - u' = w' - w
$$

But $u - u' \in U$ and $w' - w \in W$; hence, by $U \cap W = \{0\}$, 

$$
u - u' = 0, \quad w' - w = 0, \quad\text{and so}\quad u = u', \quad w = w'
$$

Thus, such a sum for $v \in V$ is unique, and $V = U \oplus W$. 


4.60. Prove Theorem 4.22 (for two factors): Suppose $V = U \oplus W$. Also, suppose $S = \{u_1, \ldots, u_m\}$ and 
$S' = \{w_1, \ldots, w_n\}$ are linearly independent subsets of $U$ and $W$, respectively. Then 

(a) The union $S \cup S'$ is linearly independent in $V$. 

(b) If $S$ and $S'$ are bases of $U$ and $W$, respectively, then $S \cup S'$ is a basis of $V$. 

(c) $\dim V = \dim U + \dim W$. 

(a) Suppose $a_1u_1 + \cdots + a_mu_m + b_1w_1 + \cdots + b_nw_n = 0$, where $a_i$, $b_j$ are scalars. Then 

$$
(a_1u_1 + \cdots + a_mu_m) + (b_1w_1 + \cdots + b_nw_n) = 0 = 0 + 0
$$

where $0, a_1u_1 + \cdots + a_mu_m \in U$ and $0, b_1w_1 + \cdots + b_nw_n \in W$. Because such a sum for 0 is unique, 
this leads to 

$$
a_1u_1 + \cdots + a_mu_m = 0 \quad\text{and}\quad b_1w_1 + \cdots + b_nw_n = 0
$$

Because $S$ is linearly independent, each $a_i = 0$, and because $S'$ is linearly independent, each $b_j = 0$. 
Thus, $S \cup S'$ is linearly independent. 

(b) By part (a), $S \cup S'$ is linearly independent, and, by Problem 4.57, $S \cup S'$ spans $V = U + W$. 
Thus, $S \cup S'$ is a basis of $V$. 

(c) This follows directly from part (b). 


Coordinates 

4.61. Relative to the basis $S = \{u_1, u_2\} = \{(1, 1), (2, 3)\}$ of $\mathbf{R}^2$, find the coordinate vector of $v$, where 
(a) $v = (4, -3)$, (b) $v = (a, b)$. 

In each case, set 

$$
v = xu_1 + yu_2 = x(1, 1) + y(2, 3) = (x + 2y,\ x + 3y)
$$

and then solve for $x$ and $y$. 

(a) We have 

$$
(4, -3) = (x + 2y,\ x + 3y) \quad\text{or}\quad \begin{aligned} x + 2y &= 4 \\ x + 3y &= -3 \end{aligned}
$$

The solution is $x = 18$, $y = -7$. Hence, $[v] = [18, -7]$. 

(b) We have 

$$
(a, b) = (x + 2y,\ x + 3y) \quad\text{or}\quad \begin{aligned} x + 2y &= a \\ x + 3y &= b \end{aligned}
$$

The solution is $x = 3a - 2b$, $y = -a + b$. Hence, $[v] = [3a - 2b, -a + b]$. 

4.62. Find the coordinate vector of $v = (a, b, c)$ in $\mathbf{R}^3$ relative to 

(a) the usual basis $E = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$, 

(b) the basis $S = \{u_1, u_2, u_3\} = \{(1, 1, 1), (1, 1, 0), (1, 0, 0)\}$. 

(a) Relative to the usual basis $E$, the coordinates of $[v]_E$ are the same as $v$. That is, $[v]_E = [a, b, c]$. 

(b) Set $v$ as a linear combination of $u_1, u_2, u_3$ using unknown scalars $x, y, z$. This yields 

$$
\begin{bmatrix} a \\ b \\ c \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + z\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + y + z &= a \\ x + y &= b \\ x &= c \end{aligned}
$$

Solving the system yields $x = c$, $y = b - c$, $z = a - b$. Thus, $[v]_S = [c, b - c, a - b]$. 
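
Finding a coordinate vector amounts to solving a linear system whose coefficient columns are the basis vectors. The following is a minimal sketch of that idea for the bases of Problems 4.61 and 4.62; `rref` and `coordinates` are helper names introduced here, and the sample vector $(2, 5, 7)$ is an arbitrary choice.

```python
from fractions import Fraction

def rref(rows):
    """Return (row canonical form, pivot column indices) using exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def coordinates(basis, v):
    """Coordinates of v relative to an ordered basis of R^n."""
    n = len(basis)
    # column j of the coefficient matrix is the j-th basis vector; last column is v
    aug = [[basis[j][i] for j in range(n)] + [v[i]] for i in range(n)]
    R, pivots = rref(aug)
    if pivots != list(range(n)):
        raise ValueError("the given vectors do not form a basis")
    return [R[i][n] for i in range(n)]

coords_461 = coordinates([(1, 1), (2, 3)], (4, -3))
coords_462 = coordinates([(1, 1, 1), (1, 1, 0), (1, 0, 0)], (2, 5, 7))
print(coords_461, coords_462)
```

For the second call the result agrees with the formula $[c, b - c, a - b]$ evaluated at $(a, b, c) = (2, 5, 7)$.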


4.63. Consider the vector space $P_3(t)$ of polynomials of degree $\leq 3$. 

(a) Show that $S = \{(t - 1)^3, (t - 1)^2, t - 1, 1\}$ is a basis of $P_3(t)$. 

(b) Find the coordinate vector $[v]$ of $v = 3t^3 + 4t^2 + 2t - 5$ relative to $S$. 

(a) The degree of $(t - 1)^k$ is $k$; writing the polynomials of $S$ in reverse order, we see that no polynomial is a 
linear combination of preceding polynomials. Thus, the polynomials are linearly independent, and, 
because $\dim P_3(t) = 4$, they form a basis of $P_3(t)$. 

(b) Set $v$ as a linear combination of the basis vectors using unknown scalars $x, y, z, s$. We have 

$$
\begin{aligned}
v = 3t^3 + 4t^2 + 2t - 5 &= x(t - 1)^3 + y(t - 1)^2 + z(t - 1) + s(1) \\
&= x(t^3 - 3t^2 + 3t - 1) + y(t^2 - 2t + 1) + z(t - 1) + s(1) \\
&= xt^3 - 3xt^2 + 3xt - x + yt^2 - 2yt + y + zt - z + s \\
&= xt^3 + (-3x + y)t^2 + (3x - 2y + z)t + (-x + y - z + s)
\end{aligned}
$$

Then set coefficients of the same powers of $t$ equal to each other to obtain 

$$
x = 3, \quad -3x + y = 4, \quad 3x - 2y + z = 2, \quad -x + y - z + s = -5
$$

Solving the system yields $x = 3$, $y = 13$, $z = 19$, $s = 4$. Thus, $[v] = [3, 13, 19, 4]$. 
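
An alternative to matching coefficients, offered here only as a cross-check and not as the method of the solution, is repeated synthetic division by $t - 1$: the successive remainders are the coordinates relative to $1, (t - 1), (t - 1)^2, (t - 1)^3$. The function name below is hypothetical, and the polynomial is the one expanded in the solution, $3t^3 + 4t^2 + 2t - 5$.

```python
def coords_in_shifted_powers(coeffs, c):
    """Coordinates of a polynomial relative to (t-c)^n, ..., (t-c), 1.

    coeffs lists the coefficients from the highest power of t down to the constant.
    """
    work = list(coeffs)
    remainders = []
    while work:
        # synthetic division of the current polynomial by (t - c)
        acc = 0
        quotient = []
        for a in work:
            acc = acc * c + a
            quotient.append(acc)
        remainders.append(quotient.pop())   # last value is the remainder
        work = quotient                      # continue with the quotient
    return remainders[::-1]                  # highest power of (t - c) first

coords = coords_in_shifted_powers([3, 4, 2, -5], 1)
print(coords)
```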


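4.64. Find the coordinate vector of $A = \begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}$ in the real vector space $M = M_{2,2}$ relative to 

(a) the basis $S = \left\{ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \right\}$, 

(b) the usual basis $E = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$. 

(a) Set $A$ as a linear combination of the basis vectors using unknown scalars $x, y, z, t$ as follows: 

$$
A = \begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}
= x\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + y\begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix} + z\begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix} + t\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} x + y + z + t & x - y - z \\ x + y & x \end{bmatrix}
$$

Set corresponding entries equal to each other to obtain the system 

$$
x + y + z + t = 2, \quad x - y - z = 3, \quad x + y = 4, \quad x = -7
$$

Solving the system yields $x = -7$, $y = 11$, $z = -21$, $t = 19$. Thus, $[A]_S = [-7, 11, -21, 19]$. (Note that 
the coordinate vector of $A$ is a vector in $\mathbf{R}^4$, because $\dim M = 4$.) 

(b) Expressing $A$ as a linear combination of the basis matrices yields 

$$
\begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}
= x\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + y\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + z\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + t\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} x & y \\ z & t \end{bmatrix}
$$

Thus, $x = 2$, $y = 3$, $z = 4$, $t = -7$. Hence, $[A] = [2, 3, 4, -7]$, whose components are the elements of $A$ 
written row by row. 

Remark: This result is true in general; that is, if $A$ is any $m \times n$ matrix in $M = M_{m,n}$, then the 
coordinates of $A$ relative to the usual basis of $M$ are the elements of $A$ written row by row. 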
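
Both answers of Problem 4.64 can be confirmed by direct computation. In the sketch below (an added illustration), the basis matrices of $S$ are the ones read off from the expansion $[x + y + z + t,\ x - y - z;\ x + y,\ x]$ in part (a); the variable names are chosen here.

```python
A = [[2, 3], [4, -7]]

# Basis S, as implied by the entries x+y+z+t, x-y-z, x+y, x in part (a)
S = [
    [[1, 1], [1, 1]],
    [[1, -1], [1, 0]],
    [[1, -1], [0, 0]],
    [[1, 0], [0, 0]],
]
coords_S = [-7, 11, -21, 19]

# Rebuild the matrix from its coordinates relative to S
combo = [
    [sum(k * B[i][j] for k, B in zip(coords_S, S)) for j in range(2)]
    for i in range(2)
]

# Coordinates relative to the usual basis: the entries written row by row
coords_E = [entry for row in A for entry in row]
print(combo, coords_E)
```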


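4.65. In the space $M = M_{2,3}$, determine whether or not the following matrices are linearly dependent: 

$$
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 5 \end{bmatrix}, \quad
B = \begin{bmatrix} 2 & 4 & 7 \\ 10 & 1 & 13 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 2 & 5 \\ 8 & 2 & 11 \end{bmatrix}
$$

If the matrices are linearly dependent, find the dimension and a basis of the subspace $W$ of $M$ 
spanned by the matrices. 

The coordinate vectors of the above matrices relative to the usual basis of $M$ are as follows: 

$$
[A] = [1, 2, 3, 4, 0, 5], \quad [B] = [2, 4, 7, 10, 1, 13], \quad [C] = [1, 2, 5, 8, 2, 11]
$$

Form the matrix $M$ whose rows are the above coordinate vectors, and reduce $M$ to echelon form: 

$$
M = \begin{bmatrix} 1 & 2 & 3 & 4 & 0 & 5 \\ 2 & 4 & 7 & 10 & 1 & 13 \\ 1 & 2 & 5 & 8 & 2 & 11 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 3 & 4 & 0 & 5 \\ 0 & 0 & 1 & 2 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
$$

Because the echelon matrix has only two nonzero rows, the coordinate vectors $[A]$, $[B]$, $[C]$ span a space of 
dimension two, and so they are linearly dependent. Thus, $A$, $B$, $C$ are linearly dependent. Furthermore, 
$\dim W = 2$, and the matrices 

$$
w_1 = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 5 \end{bmatrix} \quad\text{and}\quad w_2 = \begin{bmatrix} 0 & 0 & 1 \\ 2 & 1 & 3 \end{bmatrix}
$$

corresponding to the nonzero rows of the echelon matrix form a basis of $W$. 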
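
The procedure of Problem 4.65 (flatten each matrix into its coordinate vector, row reduce, and reshape the nonzero rows) can be sketched as follows. This is an illustrative implementation with helper names chosen here; it uses forward elimination without scaling, which for this input reproduces the echelon rows shown in the solution.

```python
from fractions import Fraction

def echelon_rows(rows):
    """Forward elimination; returns the nonzero rows of an echelon form."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m[:r]

A = [[1, 2, 3], [4, 0, 5]]
B = [[2, 4, 7], [10, 1, 13]]
C = [[1, 2, 5], [8, 2, 11]]

# coordinate vectors relative to the usual basis: entries written row by row
coordinate_vectors = [[x for row in mat for x in row] for mat in (A, B, C)]

nonzero = echelon_rows(coordinate_vectors)
dim_W = len(nonzero)
dependent = dim_W < 3

# reshape each nonzero echelon row back into a 2 x 3 matrix
basis = [[row[0:3], row[3:6]] for row in nonzero]
print(dim_W, dependent, basis)
```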


Miscellaneous Problems 

4.66. Consider a finite sequence of vectors $S = \{v_1, v_2, \ldots, v_n\}$. Let $T$ be the sequence of vectors 
obtained from $S$ by one of the following “elementary operations”: (i) interchange two vectors, 
(ii) multiply a vector by a nonzero scalar, (iii) add a multiple of one vector to another. Show that $S$ 
and $T$ span the same space $W$. Also show that $T$ is independent if and only if $S$ is independent. 

Observe that, for each operation, the vectors in $T$ are linear combinations of vectors in $S$. On the other hand, 
each operation has an inverse of the same type (Prove!); hence, the vectors in $S$ are linear combinations of 
vectors in $T$. Thus $S$ and $T$ span the same space $W$. Also, $T$ is independent if and only if $\dim W = n$, and this is 
true if and only if $S$ is also independent. 

4.67. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be row equivalent $m \times n$ matrices over a field $K$, and let $v_1, \ldots, v_n$ be 
any vectors in a vector space $V$ over $K$. Let 

$$
\begin{aligned}
u_1 &= a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n & \qquad w_1 &= b_{11}v_1 + b_{12}v_2 + \cdots + b_{1n}v_n \\
u_2 &= a_{21}v_1 + a_{22}v_2 + \cdots + a_{2n}v_n & \qquad w_2 &= b_{21}v_1 + b_{22}v_2 + \cdots + b_{2n}v_n \\
&\ \ \vdots & &\ \ \vdots \\
u_m &= a_{m1}v_1 + a_{m2}v_2 + \cdots + a_{mn}v_n & \qquad w_m &= b_{m1}v_1 + b_{m2}v_2 + \cdots + b_{mn}v_n
\end{aligned}
$$

Show that $\{u_i\}$ and $\{w_i\}$ span the same space. 

Applying an “elementary operation” of Problem 4.66 to $\{u_i\}$ is equivalent to applying an elementary row 
operation to the matrix $A$. Because $A$ and $B$ are row equivalent, $B$ can be obtained from $A$ by a sequence of 
elementary row operations; hence, $\{w_i\}$ can be obtained from $\{u_i\}$ by the corresponding sequence of 
operations. Accordingly, $\{u_i\}$ and $\{w_i\}$ span the same space. 

4.68. Let $v_1, \ldots, v_n$ belong to a vector space $V$ over $K$, and let $P = [a_{ij}]$ be an $n$-square matrix over $K$. Let 

$$
w_1 = a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n, \quad \ldots, \quad w_n = a_{n1}v_1 + a_{n2}v_2 + \cdots + a_{nn}v_n
$$

(a) Suppose $P$ is invertible. Show that $\{w_i\}$ and $\{v_i\}$ span the same space; hence, $\{w_i\}$ is 
independent if and only if $\{v_i\}$ is independent. 

(b) Suppose $P$ is not invertible. Show that $\{w_i\}$ is dependent. 

(c) Suppose $\{w_i\}$ is independent. Show that $P$ is invertible. 

(a) Because $P$ is invertible, it is row equivalent to the identity matrix $I$. Hence, by Problem 4.67, $\{w_i\}$ and 
$\{v_i\}$ span the same space. Thus, one is independent if and only if the other is. 

(b) Because $P$ is not invertible, it is row equivalent to a matrix with a zero row. This means that $\{w_i\}$ spans a 
space that has a spanning set of fewer than $n$ elements. Thus, $\{w_i\}$ is dependent. 

(c) This is the contrapositive of the statement of (b), and so it follows from (b). 

4.69. Find a homogeneous system whose solution space is spanned by 

$$
v_1 = (1, 2, 1, 2, 1), \quad v_2 = (1, 3, 2, 5, 3), \quad v_3 = (1, 3, 3, 6, 7)
$$

First we seek the orthogonal complement to the $v$'s, that is, the set of vectors $w = (a, b, c, d, e)$ orthogonal 
to $v_1, v_2, v_3$. Accordingly, we seek the solution to the system 

$$
\begin{aligned} a + 2b + c + 2d + e &= 0 \\ a + 3b + 2c + 5d + 3e &= 0 \\ a + 3b + 3c + 6d + 7e &= 0 \end{aligned}
\qquad\text{or}\qquad
\begin{aligned} a + 2b + c + 2d + e &= 0 \\ b + c + 3d + 2e &= 0 \\ c + d + 4e &= 0 \end{aligned}
$$

Here $d$ and $e$ are free variables. Setting $(d, e)$ equal to $(1, 0)$ and then $(0, 1)$ and back-substituting yields the 
following two solutions of the system: 

$$
w_1 = (3, -2, -1, 1, 0) \quad\text{and}\quad w_2 = (-1, 2, -4, 0, 1)
$$

Thus the required homogeneous system, in unknowns $x, y, z, s, t$, is 

$$
3x - 2y - z + s = 0 \quad\text{and}\quad -x + 2y - 4z + t = 0
$$

(Clearly the solution is not unique.) 
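
The back-substitution in the echelon system $a + 2b + c + 2d + e = 0$, $b + c + 3d + 2e = 0$, $c + d + 4e = 0$ can be carried out explicitly and the result tested for orthogonality against $v_1, v_2, v_3$. The sketch below is an added check; `back_substitute` and `dot` are names chosen here.

```python
def back_substitute(d, e):
    """Solve the echelon system for a, b, c given the free variables d and e."""
    c = -d - 4 * e                    # from c + d + 4e = 0
    b = -c - 3 * d - 2 * e            # from b + c + 3d + 2e = 0
    a = -2 * b - c - 2 * d - e        # from a + 2b + c + 2d + e = 0
    return (a, b, c, d, e)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

spanning = [(1, 2, 1, 2, 1), (1, 3, 2, 5, 3), (1, 3, 3, 6, 7)]

w1 = back_substitute(1, 0)
w2 = back_substitute(0, 1)

# Each w gives one equation of the homogeneous system; every spanning
# vector must satisfy it, i.e. every dot product below must be 0.
products = [[dot(w, v) for v in spanning] for w in (w1, w2)]
print(w1, w2, products)
```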

4.70. Let $K$ be a subfield of a field $L$, and let $L$ be a subfield of a field $E$. (Thus, $K \subseteq L \subseteq E$, and $K$ is a 
subfield of $E$.) Suppose $E$ is of dimension $n$ over $L$, and $L$ is of dimension $m$ over $K$. Show that $E$ is 
of dimension $mn$ over $K$. 

Suppose $\{v_1, \ldots, v_n\}$ is a basis of $E$ over $L$ and $\{a_1, \ldots, a_m\}$ is a basis of $L$ over $K$. We claim that 
$\{a_iv_j : i = 1, \ldots, m,\ j = 1, \ldots, n\}$ is a basis of $E$ over $K$. Note that $\{a_iv_j\}$ contains $mn$ elements. 

Let $w$ be any arbitrary element in $E$. Because $\{v_1, \ldots, v_n\}$ spans $E$ over $L$, $w$ is a linear combination of 
the $v_i$ with coefficients in $L$: 

$$
w = b_1v_1 + b_2v_2 + \cdots + b_nv_n, \qquad b_i \in L \qquad (1)
$$

Because $\{a_1, \ldots, a_m\}$ spans $L$ over $K$, each $b_i \in L$ is a linear combination of the $a_j$ with coefficients in $K$: 

$$
\begin{aligned}
b_1 &= k_{11}a_1 + k_{12}a_2 + \cdots + k_{1m}a_m \\
b_2 &= k_{21}a_1 + k_{22}a_2 + \cdots + k_{2m}a_m \\
&\ \ \vdots \\
b_n &= k_{n1}a_1 + k_{n2}a_2 + \cdots + k_{nm}a_m
\end{aligned}
$$

where $k_{ij} \in K$. Substituting in (1), we obtain 

$$
\begin{aligned}
w &= (k_{11}a_1 + \cdots + k_{1m}a_m)v_1 + (k_{21}a_1 + \cdots + k_{2m}a_m)v_2 + \cdots + (k_{n1}a_1 + \cdots + k_{nm}a_m)v_n \\
&= k_{11}a_1v_1 + \cdots + k_{1m}a_mv_1 + k_{21}a_1v_2 + \cdots + k_{2m}a_mv_2 + \cdots + k_{n1}a_1v_n + \cdots + k_{nm}a_mv_n \\
&= \sum_{i,j} k_{ji}(a_iv_j)
\end{aligned}
$$

where $k_{ji} \in K$. Thus, $w$ is a linear combination of the $a_iv_j$ with coefficients in $K$; hence, $\{a_iv_j\}$ spans $E$ over $K$. 

The proof is complete if we show that $\{a_iv_j\}$ is linearly independent over $K$. Suppose, for scalars 
$x_{ji} \in K$, we have $\sum_{i,j} x_{ji}(a_iv_j) = 0$; that is, 

$$
(x_{11}a_1v_1 + x_{12}a_2v_1 + \cdots + x_{1m}a_mv_1) + \cdots + (x_{n1}a_1v_n + x_{n2}a_2v_n + \cdots + x_{nm}a_mv_n) = 0
$$

or 

$$
(x_{11}a_1 + x_{12}a_2 + \cdots + x_{1m}a_m)v_1 + \cdots + (x_{n1}a_1 + x_{n2}a_2 + \cdots + x_{nm}a_m)v_n = 0
$$

Because $\{v_1, \ldots, v_n\}$ is linearly independent over $L$ and the above coefficients of the $v_i$ belong to $L$, each 
coefficient must be 0: 

$$
x_{11}a_1 + x_{12}a_2 + \cdots + x_{1m}a_m = 0, \quad \ldots, \quad x_{n1}a_1 + x_{n2}a_2 + \cdots + x_{nm}a_m = 0
$$

But $\{a_1, \ldots, a_m\}$ is linearly independent over $K$; hence, because the $x_{ji} \in K$, 

$$
x_{11} = 0,\ x_{12} = 0,\ \ldots,\ x_{1m} = 0,\ \ldots,\ x_{n1} = 0,\ x_{n2} = 0,\ \ldots,\ x_{nm} = 0
$$

Accordingly, $\{a_iv_j\}$ is linearly independent over $K$, and the theorem is proved. 


SUPPLEMENTARY PROBLEMS 


Vector Spaces 

4.71. Suppose $u$ and $v$ belong to a vector space $V$. Simplify each of the following expressions: 

(a) $E_1 = 4(5u - 6v) + 2(3u + v)$, (c) $E_3 = 6(3u + 2v) + 5u - 7v$, 

(b) $E_2 = 5(2u - 3v) + 4(7v + 8)$, (d) $E_4 = 3(5u + 2/v)$. 

4.72. Let V be the set of ordered pairs (a, b) of real numbers with addition in V and scalar multiplication on V 
defined by 

(a,b) + (c,d) = (a + c, b + d) and k(a,b) = (ka, 0) 

Show that $V$ satisfies all the axioms of a vector space except $[M_4]$; that is, except $1u = u$. Hence, $[M_4]$ is not 
a consequence of the other axioms. 

4.73. Show that Axiom $[A_4]$ of a vector space $V$ (that $u + v = v + u$) can be derived from the other axioms for $V$. 

4.74. Let V be the set of ordered pairs (a, b) of real numbers. Show that V is not a vector space over R with 

addition and scalar multiplication defined by 

(i) $(a, b) + (c, d) = (a + d, b + c)$ and $k(a, b) = (ka, kb)$, 

(ii) $(a, b) + (c, d) = (a + c, b + d)$ and $k(a, b) = (a, b)$, 

(iii) $(a, b) + (c, d) = (0, 0)$ and $k(a, b) = (ka, kb)$, 

(iv) $(a, b) + (c, d) = (ac, bd)$ and $k(a, b) = (ka, kb)$. 

4.75. Let $V$ be the set of infinite sequences $(a_1, a_2, \ldots)$ in a field $K$. Show that $V$ is a vector space over $K$ with 
addition and scalar multiplication defined by 

$$
(a_1, a_2, \ldots) + (b_1, b_2, \ldots) = (a_1 + b_1, a_2 + b_2, \ldots) \quad\text{and}\quad k(a_1, a_2, \ldots) = (ka_1, ka_2, \ldots)
$$

4.76. Let $U$ and $W$ be vector spaces over a field $K$. Let $V$ be the set of ordered pairs $(u, w)$ where $u \in U$ and $w \in W$. 
Show that $V$ is a vector space over $K$ with addition in $V$ and scalar multiplication on $V$ defined by 

$$
(u, w) + (u', w') = (u + u', w + w') \quad\text{and}\quad k(u, w) = (ku, kw)
$$

(This space $V$ is called the external direct product of $U$ and $W$.) 

Subspaces 

4.77. Determine whether or not $W$ is a subspace of $\mathbf{R}^3$ where $W$ consists of all vectors $(a, b, c)$ in $\mathbf{R}^3$ such that 

(a) $a = 3b$, (b) $a \leq b \leq c$, (c) $ab = 0$, (d) $a + b + c = 0$, (e) $b = a^2$, (f) $a = 2b = 3c$. 

4.78. Let $V$ be the vector space of $n$-square matrices over a field $K$. Show that $W$ is a subspace of $V$ if $W$ consists of 
all matrices $A = [a_{ij}]$ that are 

(a) symmetric ($A^T = A$ or $a_{ij} = a_{ji}$), (b) (upper) triangular, (c) diagonal, (d) scalar. 

4.79. Let $AX = B$ be a nonhomogeneous system of linear equations in $n$ unknowns; that is, $B \neq 0$. Show that the 
solution set is not a subspace of $K^n$. 

4.80. Suppose $U$ and $W$ are subspaces of $V$ for which $U \cup W$ is a subspace. Show that $U \subseteq W$ or $W \subseteq U$. 

4.81. Let $V$ be the vector space of all functions from the real field $\mathbf{R}$ into $\mathbf{R}$. Show that $W$ is a subspace of $V$ where 
$W$ consists of all: (a) bounded functions, (b) even functions. [Recall that $f: \mathbf{R} \to \mathbf{R}$ is bounded if $\exists M \in \mathbf{R}$ 
such that $\forall x \in \mathbf{R}$, we have $|f(x)| \leq M$; and $f(x)$ is even if $f(-x) = f(x)$, $\forall x \in \mathbf{R}$.] 

4.82. Let $V$ be the vector space (Problem 4.75) of infinite sequences $(a_1, a_2, \ldots)$ in a field $K$. Show that $W$ is a 
subspace of $V$ if $W$ consists of all sequences with (a) 0 as the first element, (b) only a finite number of 
nonzero elements. 

Linear Combinations, Linear Spans 

4.83. Consider the vectors u = (1, 2, 3) and v = (2, 3, 1) in R^3.

(a) Write w = (1, 3, 8) as a linear combination of u and v.

(b) Write w = (2, 4, 5) as a linear combination of u and v.

(c) Find k so that w = (1, k, 4) is a linear combination of u and v.

(d) Find conditions on a, b, c so that w = (a, b, c) is a linear combination of u and v.

4.84. Write the polynomial f(t) = at^2 + bt + c as a linear combination of the polynomials p_1 = (t − 1)^2,
p_2 = t − 1, p_3 = 1. [Thus, p_1, p_2, p_3 span the space P_2(t) of polynomials of degree ≤ 2.]

4.85. Find one vector in R^3 that spans the intersection of U and W where U is the xy-plane, that is,
U = {(a, b, 0)}, and W is the space spanned by the vectors (1, 1, 1) and (1, 2, 3).

4.86. Prove that span(S) is the intersection of all subspaces of V containing S.

4.87. Show that span(S) = span(S ∪ {0}). That is, by joining or deleting the zero vector from a set, we do not
change the space spanned by the set.

4.88. Show that (a) If S ⊆ T, then span(S) ⊆ span(T). (b) span[span(S)] = span(S).

Linear Dependence and Linear Independence 

4.89. Determine whether the following vectors in R^4 are linearly dependent or independent:

(a) (1, 2, −3, 1), (3, 7, 1, −2), (1, 3, 7, −4); (b) (1, 3, 1, −2), (2, 5, −1, 3), (1, 3, 7, −2).

4.90. Determine whether the following polynomials u, v, w in P(t) are linearly dependent or independent:

(a) u = t^3 − 4t^2 + 3t + 3, v = t^3 + 2t^2 + 4t − 1, w = 2t^3 − t^2 − 3t + 5;

(b) u = t^3 − 5t^2 − 2t + 3, v = t^3 − 4t^2 − 3t + 4, w = 2t^3 − 7t^2 − 7t + 9.

4.91. Show that the following functions f, g, h are linearly independent:

(a) f(t) = e^t, g(t) = sin t, h(t) = t^2; (b) f(t) = e^t, g(t) = e^{2t}, h(t) = t.

4.92. Show that u = (a, b) and v = (c, d) in K^2 are linearly dependent if and only if ad − bc = 0.

4.93. Suppose u, v, w are linearly independent vectors. Prove that S is linearly independent where

(a) S = {u + v − 2w, u − v − w, u + w}; (b) S = {u + v − 3w, u + 3v − w, v + w}.

4.94. Suppose {u_1, ..., u_r, w_1, ..., w_s} is a linearly independent subset of V. Show that

span(u_i) ∩ span(w_j) = {0}

4.95. Suppose v_1, v_2, ..., v_n are linearly independent. Prove that S is linearly independent where

(a) S = {a_1 v_1, a_2 v_2, ..., a_n v_n} and each a_i ≠ 0.

(b) S = {v_1, ..., v_{k−1}, w, v_{k+1}, ..., v_n} and w = Σ_i b_i v_i and b_k ≠ 0.

4.96. Suppose (a_11, ..., a_1n), (a_21, ..., a_2n), ..., (a_m1, ..., a_mn) are linearly independent vectors in K^n, and
suppose v_1, v_2, ..., v_n are linearly independent vectors in a vector space V over K. Show that the following
vectors are also linearly independent:

w_1 = a_11 v_1 + ... + a_1n v_n, w_2 = a_21 v_1 + ... + a_2n v_n, ..., w_m = a_m1 v_1 + ... + a_mn v_n


Basis and Dimension 

4.97. Find a subset of u_1, u_2, u_3, u_4 that gives a basis for W = span(u_i) of R^5, where

(a) u_1 = (1, 1, 1, 2, 3), u_2 = (1, 2, −1, −2, 1), u_3 = (3, 5, −1, −2, 5), u_4 = (1, 2, 1, −1, 4)

(b) u_1 = (1, −2, 1, 3, −1), u_2 = (2, 4, 2, 6, 2), u_3 = (1, −3, 1, 2, 1), u_4 = (3, −7, 3, 8, −1)

(c) u_1 = (1, 0, 1, 0, 1), u_2 = (1, 1, 2, 1, 0), u_3 = (2, 1, 3, 1, 1), u_4 = (1, 2, 1, 1, 1)

(d) u_1 = (1, 0, 1, 1, 1), u_2 = (2, 1, 2, 0, 1), u_3 = (1, 1, 2, 3, 4), u_4 = (4, 2, 5, 4, 6)

4.98. Consider the subspaces U = {(a, b, c, d) : b − 2c + d = 0} and W = {(a, b, c, d) : a = d, b = 2c} of R^4.
Find a basis and the dimension of (a) U, (b) W, (c) U ∩ W.

4.99. Find a basis and the dimension of the solution space W of each of the following homogeneous systems:

(a) x + 2y − 2z + 2s − t = 0        (b) x + 2y − z + 3s − 4t = 0
    x + 2y − z + 3s − 2t = 0            2x + 4y − 2z − s + 5t = 0
    2x + 4y − 7z + s + t = 0            2x + 4y − 2z + 4s − 2t = 0

4.100. Find a homogeneous system whose solution space is spanned by the following sets of three vectors:

(a) (1, −2, 0, 3, −1), (2, −3, 2, 5, −3), (1, −2, 1, 2, −2);

(b) (1, 1, 2, 1, 1), (1, 2, 1, 4, 3), (3, 5, 4, 9, 7).

4.101. Determine whether each of the following is a basis of the vector space P_n(t):

(a) {1, 1 + t, 1 + t + t^2, 1 + t + t^2 + t^3, ..., 1 + t + t^2 + ... + t^{n−1} + t^n};

(b) {1 + t, t + t^2, t^2 + t^3, ..., t^{n−2} + t^{n−1}, t^{n−1} + t^n}.

4.102. Find a basis and the dimension of the subspace W of P(t) spanned by

(a) u = t^3 + 2t^2 − 2t + 1, v = t^3 + 3t^2 − 3t + 4, w = 2t^3 + 7t^2 − 7t + 11,

(b) u = t^3 + t^2 − 3t + 2, v = 2t^3 + t^2 + t − 4, w = 4t^3 + 3t^2 − 5t + 2.


4.103. Find a basis and the dimension of the subspace W of V = M_{2,2} spanned by

A = [1, −5; −4, 2], B = [1, 1; −1, 5], C = [2, −4; −5, 7], D = [1, −7; −5, 1]


Rank of a Matrix, Row and Column Spaces 
4.104. Find the rank of each of the following matrices:

(a) [1, 3, −2, 5, 4; 1, 4, 1, 3, 5; 1, 4, 2, 4, 3; 2, 7, −3, 6, 13],

(b) [1, 2, −3, −2; 1, 3, −2, 0; 3, 8, −7, −2; 2, 1, −9, −10],

(c) [1, 1, 2; 4, 5, 5; 5, 8, 1; −1, −2, 2]


4.105. For k = 1, 2, ..., 5, find the number n_k of linearly independent subsets consisting of k columns for each of
the following matrices:

(a) A = [1, 1, 0, 2, 3; 1, 2, 0, 2, 5; 1, 3, 0, 2, 7],

(b) B = [1, 2, 1, 0, 2; 1, 2, 3, 0, 4; 1, 1, 5, 0, 6]

4.106. Let (a) A = [1, 2, 1, 3, 1, 6; 2, 4, 3, 8, 3, 15; 1, 2, 2, 5, 3, 11; 4, 8, 6, 16, 7, 32],

(b) B = [1, 2, 2, 1, 2, 1; 2, 4, 5, 4, 5, 5; 1, 2, 3, 4, 4, 6; 3, 6, 7, 7, 9, 10].

For each matrix (where C_1, ..., C_6 denote its columns):

(i) Find its row canonical form M.

(ii) Find the columns that are linear combinations of preceding columns.

(iii) Find columns (excluding C_6) that form a basis for the column space.

(iv) Express C_6 as a linear combination of the basis vectors obtained in (iii).


4.107. Determine which of the following matrices have the same row space:

A = [1, −2, −1; 3, −4, 5], B = [1, −1, 2; 2, 3, −1], C = [1, −1, 3; 2, −1, 10; 3, −5, 1]


4.108. Determine which of the following subspaces of R^3 are identical:

U_1 = span[(1, 1, −1), (2, 3, −1), (3, 1, −5)], U_2 = span[(1, −1, −3), (3, −2, −8), (2, 1, −3)],

U_3 = span[(1, 1, 1), (1, −1, 3), (3, −1, 7)]


4.109. Determine which of the following subspaces of R^4 are identical:

U_1 = span[(1, 2, 1, 4), (2, 4, 1, 5), (3, 6, 2, 9)], U_2 = span[(1, 2, 1, 2), (2, 4, 1, 3)],

U_3 = span[(1, 2, 3, 10), (2, 4, 3, 11)]


4.110. Find a basis for (i) the row space and (ii) the column space of each matrix M:

(a) M = [0, 0, 3, 1, 4; 1, 3, 1, 2, 1; 3, 9, 4, 5, 2; 4, 12, 8, 8, 7],

(b) M = [1, 2, 1, 0, 1; 1, 2, 2, 1, 3; 3, 6, 5, 2, 7; 2, 4, 1, −1, 0]


4.111. Show that if any row is deleted from a matrix in echelon (respectively, row canonical) form, then the 
resulting matrix is still in echelon (respectively, row canonical) form. 

4.112. Let A and B be arbitrary m × n matrices. Show that rank(A + B) ≤ rank(A) + rank(B).

4.113. Let r = rank(A + B). Find 2 × 2 matrices A and B such that

(a) r < rank(A), rank(B); (b) r = rank(A) = rank(B); (c) r > rank(A), rank(B).

Sums, Direct Sums, Intersections

4.114. Suppose U and W are two-dimensional subspaces of K^3. Show that U ∩ W ≠ {0}.

4.115. Suppose U and W are subspaces of V such that dim U = 4, dim W = 5, and dim V = 7. Find the possible
dimensions of U ∩ W.

4.116. Let U and W be subspaces of R^3 for which dim U = 1, dim W = 2, and U ⊄ W. Show that R^3 = U ⊕ W.

4.117. Consider the following subspaces of R^5:

U = span[(1, −1, −1, −2, 0), (1, −2, −2, 0, −3), (1, −1, −2, −2, 1)]

W = span[(1, −2, −3, 0, −2), (1, −1, −3, 2, −4), (1, −1, −2, 2, −5)]

(a) Find two homogeneous systems whose solution spaces are U and W, respectively.

(b) Find a basis and the dimension of U ∩ W.

4.118. Let U_1, U_2, U_3 be the following subspaces of R^3:

U_1 = {(a, b, c) : a = c}, U_2 = {(a, b, c) : a + b + c = 0}, U_3 = {(0, 0, c)}

Show that (a) R^3 = U_1 + U_2, (b) R^3 = U_2 + U_3, (c) R^3 = U_1 + U_3. When is the sum direct?

4.119. Suppose U, W_1, W_2 are subspaces of a vector space V. Show that

(U ∩ W_1) + (U ∩ W_2) ⊆ U ∩ (W_1 + W_2)

Find subspaces of R^2 for which equality does not hold.

4.120. Suppose W_1, W_2, ..., W_r are subspaces of a vector space V. Show that

(a) span(W_1, W_2, ..., W_r) = W_1 + W_2 + ... + W_r.

(b) If S_i spans W_i for i = 1, ..., r, then S_1 ∪ S_2 ∪ ... ∪ S_r spans W_1 + W_2 + ... + W_r.

4.121. Suppose V = U ⊕ W. Show that dim V = dim U + dim W.

4.122. Let S and T be arbitrary nonempty subsets (not necessarily subspaces) of a vector space V and let k be a
scalar. The sum S + T and the scalar product kS are defined by

S + T = {u + v : u ∈ S, v ∈ T}, kS = {ku : u ∈ S}

[We also write w + S for {w} + S.] Let

S = {(1, 2), (2, 3)}, T = {(1, 4), (1, 5), (2, 5)}, w = (1, 1), k = 3

Find: (a) S + T, (b) w + S, (c) kS, (d) kT, (e) kS + kT, (f) k(S + T).

4.123. Show that the above operations of S + T and kS satisfy

(a) Commutative law: S + T = T + S.

(b) Associative law: (S_1 + S_2) + S_3 = S_1 + (S_2 + S_3).

(c) Distributive law: k(S + T) = kS + kT.

(d) S + {0} = {0} + S = S and S + V = V + S = V.

4.124. Let V be the vector space of n-square matrices. Let U be the subspace of upper triangular matrices, and let W
be the subspace of lower triangular matrices. Find (a) U ∩ W, (b) U + W.

4.125. Let V be the external direct sum of vector spaces U and W over a field K. (See Problem 4.76.) Let

Û = {(u, 0) : u ∈ U} and Ŵ = {(0, w) : w ∈ W}

Show that (a) Û and Ŵ are subspaces of V, (b) V = Û ⊕ Ŵ.

4.126. Suppose V = U ⊕ W. Let V̂ be the external direct sum of U and W. Show that V is isomorphic to V̂ under
the correspondence v = u + w ↔ (u, w).

4.127. Use induction to prove (a) Theorem 4.22, (b) Theorem 4.23. 

Coordinates 

4.128. The vectors u_1 = (1, −2) and u_2 = (4, −7) form a basis S of R^2. Find the coordinate vector [v] of v relative
to S where (a) v = (5, 3), (b) v = (a, b).

4.129. The vectors u_1 = (1, 2, 0), u_2 = (1, 3, 2), u_3 = (0, 1, 3) form a basis S of R^3. Find the coordinate vector [v]
of v relative to S where (a) v = (2, 7, −4), (b) v = (a, b, c).
4.130. S = {t^3 + t^2, t^2 + t, t + 1, 1} is a basis of P_3(t). Find the coordinate vector [v] of v relative to S where

(a) v = 2t^3 + t^2 − 4t + 2, (b) v = at^3 + bt^2 + ct + d.

4.131. Let V = M_{2,2}. Find the coordinate vector [A] of A relative to S where

S = {[1, 1; 1, 1], [1, −1; 1, 0], [1, 1; 0, 0], [1, 0; 0, 0]} and (a) A = [3, −5; 6, 7], (b) A = [a, b; c, d]

4.132. Find the dimension and a basis of the subspace W of P_3(t) spanned by

u = t^3 + 2t^2 − 3t + 4, v = 2t^3 + 5t^2 − 4t + 7, w = t^3 + 4t^2 + t + 2

4.133. Find the dimension and a basis of the subspace W of M = M_{2,3} spanned by

A = [1, 2, 1; 3, 1, 2], B = [2, 4, 3; 7, 5, 6], C = [1, 2, 3; 5, 7, 6]


Miscellaneous Problems 

4.134. Answer true or false. If false, prove it with a counterexample. 

(a) If u_1, u_2, u_3 span V, then dim V = 3.

(b) If A is a 4 × 8 matrix, then any six columns are linearly dependent.

(c) If u_1, u_2, u_3 are linearly independent, then u_1, u_2, u_3, w are linearly dependent.

(d) If u_1, u_2, u_3, u_4 are linearly independent, then dim V ≥ 4.

(e) If u_1, u_2, u_3 span V, then w, u_1, u_2, u_3 span V.

(f) If u_1, u_2, u_3, u_4 are linearly independent, then u_1, u_2, u_3 are linearly independent.

4.135. Answer true or false. If false, prove it with a counterexample. 

(a) If any column is deleted from a matrix in echelon form, then the resulting matrix is still in echelon form. 

(b) If any column is deleted from a matrix in row canonical form, then the resulting matrix is still in row 
canonical form. 

(c) If any column without a pivot is deleted from a matrix in row canonical form, then the resulting matrix 
is in row canonical form. 

4.136. Determine the dimension of the vector space W of the following n-square matrices: 

(a) symmetric matrices, (b) antisymmetric matrices, 

(c) scalar matrices, (d) diagonal matrices. 

4.137. Let t_1, t_2, ..., t_n be symbols, and let K be any field. Let V be the following set of expressions where a_i ∈ K:

a_1 t_1 + a_2 t_2 + ... + a_n t_n

Define addition in V and scalar multiplication on V by

(a_1 t_1 + ... + a_n t_n) + (b_1 t_1 + ... + b_n t_n) = (a_1 + b_1)t_1 + ... + (a_n + b_n)t_n

k(a_1 t_1 + a_2 t_2 + ... + a_n t_n) = ka_1 t_1 + ka_2 t_2 + ... + ka_n t_n

Show that V is a vector space over K with the above operations. Also, show that {t_1, ..., t_n} is a basis of V,
where

t_j = 0t_1 + ... + 0t_{j−1} + 1t_j + 0t_{j+1} + ... + 0t_n

4.138. Suppose that A_1, A_2, ... are linearly independent sets of vectors and that A_1 ⊆ A_2 ⊆ ... .

Show that the union A = A_1 ∪ A_2 ∪ ... is also linearly independent.
ANSWERS TO SUPPLEMENTARY PROBLEMS


[Some answers, such as bases, need not be unique.] 

4.71. (a) E_1 = 26u − 22v; (b) The sum 7v + 8 is not defined, so E_2 is not defined;

(c) E_3 = 23u + 5v; (d) Division by v is not defined, so E_4 is not defined.


4.77. (a) Yes; (b) No; e.g., (1, 2, 3) ∈ W but −2(1, 2, 3) ∉ W;
(c) No; e.g., (1, 0, 0), (0, 1, 0) ∈ W, but not their sum; (d) Yes;
(e) No; e.g., (1, 1, 1) ∈ W but 2(1, 1, 1) ∉ W; (f) Yes

4.79. The zero vector 0 is not a solution.

4.83. (a) w = 3u − v, (b) Impossible, (c) k = 11/5, (d) 7a − 5b + c = 0

4.84. Using f = xp_1 + yp_2 + zp_3, we get x = a, y = 2a + b, z = a + b + c

4.85. v = (2, 1, 0)

4.89. (a) Dependent, (b) Independent

4.90. (a) Independent, (b) Dependent

4.97. (a) u_1, u_2, u_4; (b) u_1, u_2, u_3; (c) u_1, u_2, u_4; (d) u_1, u_2, u_3

4.98. (a) dim U = 3, (b) dim W = 2, (c) dim(U ∩ W) = 1

4.99. (a) Basis: {(2, −1, 0, 0, 0), (4, 0, 1, −1, 0), (3, 0, 1, 0, 1)}; dim W = 3;
(b) Basis: {(2, −1, 0, 0, 0), (1, 0, 1, 0, 0)}; dim W = 2

4.100. (a) 5x + y − z − s = 0, x + y − z − t = 0;
(b) 3x − y − z = 0, 2x − 3y + s = 0, x − 2y + t = 0

4.101. (a) Yes, (b) No, because dim P_n(t) = n + 1, but the set contains only n elements.

4.102. (a) dim W = 2, (b) dim W = 3

4.103. dim W = 2

4.104. (a) 3, (b) 2, (c) 3

4.105. (a) n_1 = 4, n_2 = 5, n_3 = n_4 = n_5 = 0; (b) n_1 = 4, n_2 = 6, n_3 = 3, n_4 = n_5 = 0

4.106. (a) (i) M = [1, 2, 0, 1, 0, 3; 0, 0, 1, 2, 0, 1; 0, 0, 0, 0, 1, 2; 0];

(ii) C_2, C_4, C_6; (iii) C_1, C_3, C_5; (iv) C_6 = 3C_1 + C_3 + 2C_5.

(b) (i) M = [1, 2, 0, 0, 3, 1; 0, 0, 1, 0, −1, −1; 0, 0, 0, 1, 1, 2; 0];

(ii) C_2, C_5, C_6; (iii) C_1, C_3, C_4; (iv) C_6 = C_1 − C_3 + 2C_4

4.107. A and C are row equivalent to [1, 0, 7; 0, 1, 4], but not B

4.108. U_1 and U_2 are row equivalent to [1, 0, −2; 0, 1, 1], but not U_3

4.109. U_1 and U_3 are row equivalent to [1, 2, 0, 1; 0, 0, 1, 3], but not U_2

4.110. (a) (i) (1, 3, 1, 2, 1), (0, 0, 1, −1, −1), (0, 0, 0, 4, 7); (ii) C_1, C_3, C_4;

(b) (i) (1, 2, 1, 0, 1), (0, 0, 1, 1, 2); (ii) C_1, C_3
4.113. (a) A = [1, 1; 0, 0], B = [−1, −1; 0, 0]; (b) A = [1, 0; 0, 0], B = [0, 2; 0, 0];
(c) A = [1, 0; 0, 0], B = [0, 0; 0, 1]

4.115. dim(U ∩ W) = 2, 3, or 4

4.117. (a) (i) 3x + 4y − z − t = 0, 4x + 2y + s = 0; (ii) 4x + 2y − s = 0, 9x + 2y + z + t = 0;
(b) Basis: {(1, −2, −5, 0, 0), (0, 0, 1, 0, −1)}; dim(U ∩ W) = 2

4.118. The sum is direct in (b) and (c).

4.119. In R^2, let U, V, W be, respectively, the line y = x, the x-axis, the y-axis.

4.122. (a) {(2, 6), (2, 7), (3, 7), (3, 8), (4, 8)}; (b) {(2, 3), (3, 4)};
(c) {(3, 6), (6, 9)}; (d) {(3, 12), (3, 15), (6, 15)};
(e and f) {(6, 18), (6, 21), (9, 21), (9, 24), (12, 24)}

4.124. (a) Diagonal matrices, (b) V

4.128. (a) [−47, 13], (b) [−7a − 4b, 2a + b]

4.129. (a) [−11, 13, −10], (b) [c − 3b + 7a, −c + 3b − 6a, c − 2b + 4a]

4.130. (a) [2, −1, −3, 5], (b) [a, b − a, c − b + a, d − c + b − a]

4.131. (a) [7, −1, −13, 10], (b) [d, c − d, b + c − 2d, a − b − 2c + 2d]

4.132. dim W = 2; basis: {t^3 + 2t^2 − 3t + 4, t^2 + 2t − 1}

4.133. dim W = 2; basis: {[1, 2, 1, 3, 1, 2], [0, 0, 1, 1, 3, 2]}

4.134. (a) False; (1, 1), (1, 2), (2, 1) span R^2; (b) True;
(c) False; (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), w = (0, 0, 0, 1);
(d) True; (e) True; (f) True

4.135. (a) True; (b) False; e.g., delete C_2 from [1, 0, 3; 0, 1, 2]; (c) True

4.136. (a) n(n + 1)/2, (b) n(n − 1)/2, (c) 1, (d) n



Linear Mappings 


5.1 Introduction 


The main subject matter of linear algebra is the study of linear mappings and their representation by means 
of matrices. This chapter introduces us to these linear maps and Chapter 6 shows how they can be 
represented by matrices. First, however, we begin with a study of mappings in general. 


5.2 Mappings, Functions 


Let A and B be arbitrary nonempty sets. Suppose to each element a ∈ A there is assigned a unique
element of B, called the image of a. The collection f of such assignments is called a mapping (or map)
from A into B, and it is denoted by

f: A → B

The set A is called the domain of the mapping, and B is called the target set. We write f(a), read "f of a,"
for the unique element of B that f assigns to a ∈ A.

One may also view a mapping f: A → B as a computer that, for each input value a ∈ A, produces a
unique output f(a) ∈ B.

Remark: The term function is used synonymously with the word mapping, although some texts 
reserve the word “function” for a real-valued or complex-valued mapping. 

Consider a mapping f: A → B. If A' is any subset of A, then f(A') denotes the set of images of elements
of A'; and if B' is any subset of B, then f^{-1}(B') denotes the set of elements of A, each of whose image lies
in B'. That is,

f(A') = {f(a) : a ∈ A'} and f^{-1}(B') = {a ∈ A : f(a) ∈ B'}

We call f(A') the image of A' and f^{-1}(B') the inverse image or preimage of B'. In particular, the set of all
images (i.e., f(A)) is called the image or range of f.

To each mapping f: A → B there corresponds the subset of A × B given by {(a, f(a)) : a ∈ A}. We call
this set the graph of f. Two mappings f: A → B and g: A → B are defined to be equal, written f = g, if
f(a) = g(a) for every a ∈ A, that is, if they have the same graph. Thus, we do not distinguish between a
function and its graph. The negation of f = g is written f ≠ g and is the statement:

There exists an a ∈ A for which f(a) ≠ g(a).


Sometimes the "barred" arrow ↦ is used to denote the image of an arbitrary element x ∈ A under a
mapping f: A → B by writing

x ↦ f(x)


This is illustrated in the following example. 
EXAMPLE 5.1

(a) Let f: R → R be the function that assigns to each real number x its square x^2. We can denote this function by
writing

f(x) = x^2 or x ↦ x^2

Here the image of −3 is 9, so we may write f(−3) = 9. However, f^{-1}(9) = {3, −3}. Also,
f(R) = [0, ∞) = {x : x ≥ 0} is the image of f.

(b) Let A = {a, b, c, d} and B = {x, y, z, t}. Then the following defines a mapping f: A → B:

f(a) = y, f(b) = x, f(c) = z, f(d) = y or f = {(a, y), (b, x), (c, z), (d, y)}

The first defines the mapping explicitly, and the second defines the mapping by its graph. Here,

f({a, b, d}) = {f(a), f(b), f(d)} = {y, x, y} = {x, y}

Furthermore, f(A) = {x, y, z} is the image of f.
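
A finite mapping such as the one in part (b) can be modeled as a Python dictionary that stores its graph. The sketch below is only an illustration; the helper names image and preimage are our own and do not come from the text.

```python
# The mapping f: A -> B of Example 5.1(b), stored by its graph.
f = {"a": "y", "b": "x", "c": "z", "d": "y"}

def image(f, subset):
    """Return f(A') = {f(a) : a in A'}."""
    return {f[a] for a in subset}

def preimage(f, subset):
    """Return f^{-1}(B') = {a in A : f(a) in B'}."""
    return {a for a in f if f[a] in subset}

print(image(f, {"a", "b", "d"}))   # the set {x, y}
print(image(f, f.keys()))          # the image of f: the set {x, y, z}
print(preimage(f, {"y"}))          # the set {a, d}
```

Note that t ∈ B never appears as a value, which is why the image of f is a proper subset of the target set.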

EXAMPLE 5.2 Let V be the vector space of polynomials over R, and let p(t) = 3t^2 − 5t + 2.

(a) The derivative defines a mapping D: V → V where, for any polynomial f(t), we have D(f) = df/dt. Thus,

D(p) = D(3t^2 − 5t + 2) = 6t − 5

(b) The integral, say from 0 to 1, defines a mapping J: V → R. That is, for any polynomial f(t),

J(f) = ∫_0^1 f(t) dt, and so J(p) = ∫_0^1 (3t^2 − 5t + 2) dt = 1/2

Observe that the mapping in (b) is from the vector space V into the scalar field R, whereas the mapping in (a) is from 
the vector space V into itself. 
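
The two mappings of this example can be tried out numerically. The following Python sketch assumes a representation that is not part of the text: a polynomial is stored as a list of coefficients, lowest degree first, so p(t) = 3t^2 − 5t + 2 becomes [2, -5, 3].

```python
from fractions import Fraction

def D(coeffs):
    """Derivative of c0 + c1*t + c2*t^2 + ... as a coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def J(coeffs):
    """Integral from 0 to 1, i.e. the sum of c_k / (k + 1)."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(coeffs))

p = [2, -5, 3]   # p(t) = 3t^2 - 5t + 2, lowest degree first
print(D(p))      # [-5, 6], i.e. 6t - 5
print(J(p))      # 1/2
```

D returns another polynomial (a list), whereas J returns a single scalar, mirroring the observation that D maps V into itself and J maps V into R.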


Matrix Mappings 

Let A be any m × n matrix over K. Then A determines a mapping F_A: K^n → K^m by

F_A(u) = Au

where the vectors in K^n and K^m are written as columns. For example, suppose

A = [1, −4, 5; 2, 3, −6] and u = [1, 3, −5]^T

Then

F_A(u) = Au = [1, −4, 5; 2, 3, −6][1, 3, −5]^T = [−36, 41]^T

Remark: For notational convenience, we will frequently denote the mapping F A by the letter A, the 
same symbol as used for the matrix. 
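
As a small illustration, a matrix mapping can be written in a few lines of Python. The function name matrix_map is our own choice, and the data are the 2 × 3 matrix A = [1, −4, 5; 2, 3, −6] and the vector u = (1, 3, −5).

```python
def matrix_map(A, u):
    """Apply F_A to u: each output entry is the dot product of a row of A with u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A = [[1, -4, 5],
     [2, 3, -6]]
u = [1, 3, -5]
print(matrix_map(A, u))   # [-36, 41]
```

The input has three entries and the output has two, in line with F_A: K^3 → K^2 for a 2 × 3 matrix.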

Composition of Mappings 

Consider two mappings f: A → B and g: B → C, illustrated below:

A --f--> B --g--> C

The composition of f and g, denoted by g ∘ f, is the mapping g ∘ f: A → C defined by

(g ∘ f)(a) = g(f(a))

That is, first we apply f to a ∈ A, and then we apply g to f(a) ∈ B to get g(f(a)) ∈ C. Viewing f and g as
"computers," the composition means we first input a ∈ A to get the output f(a) ∈ B using f, and then we
input f(a) to get the output g(f(a)) ∈ C using g.

Our first theorem tells us that the composition of mappings satisfies the associative law.

THEOREM 5.1: Let f: A → B, g: B → C, h: C → D. Then

h ∘ (g ∘ f) = (h ∘ g) ∘ f

We prove this theorem here. Let a ∈ A. Then

(h ∘ (g ∘ f))(a) = h((g ∘ f)(a)) = h(g(f(a)))

((h ∘ g) ∘ f)(a) = (h ∘ g)(f(a)) = h(g(f(a)))

Thus, (h ∘ (g ∘ f))(a) = ((h ∘ g) ∘ f)(a) for every a ∈ A, and so h ∘ (g ∘ f) = (h ∘ g) ∘ f.
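
The associative law can also be observed on concrete functions. In the Python sketch below, the helper compose and the three sample functions f, g, h are illustrative assumptions, not taken from the text; checking finitely many inputs illustrates the theorem but does not replace the proof.

```python
def compose(g, f):
    """Return the mapping g o f, defined by (g o f)(a) = g(f(a))."""
    return lambda a: g(f(a))

f = lambda x: x + 1
g = lambda x: 2 * x
h = lambda x: x * x

left = compose(h, compose(g, f))     # h o (g o f)
right = compose(compose(h, g), f)    # (h o g) o f
print(all(left(a) == right(a) for a in range(-5, 6)))   # True
```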

One-to-One and Onto Mappings 

We formally introduce some special types of mappings. 

DEFINITION: A mapping f: A → B is said to be one-to-one (or 1-1 or injective) if different elements of
A have distinct images; that is,

If f(a) = f(a'), then a = a'.


DEFINITION: A mapping f: A → B is said to be onto (or f maps A onto B or surjective) if every b ∈ B
is the image of at least one a ∈ A.

DEFINITION: A mapping f: A → B is said to be a one-to-one correspondence between A and B (or
bijective) if f is both one-to-one and onto.

EXAMPLE 5.3 Let f: R → R, g: R → R, h: R → R be defined by

f(x) = 2^x, g(x) = x^3 − x, h(x) = x^2

The graphs of these functions are shown in Fig. 5-1. The function f is one-to-one. Geometrically, this means
that each horizontal line does not contain more than one point of f. The function g is onto. Geometrically, this
means that each horizontal line contains at least one point of g. The function h is neither one-to-one nor onto.
For example, both 2 and −2 have the same image 4, and −16 has no preimage.
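
For mappings between finite sets, both properties can be tested by direct enumeration. The sketch below is a hypothetical illustration: it restricts h(x) = x^2 to a small set of integers, and the helper names is_one_to_one and is_onto are our own.

```python
def is_one_to_one(f):
    """True if distinct elements of the domain have distinct images."""
    return len(set(f.values())) == len(f)

def is_onto(f, target):
    """True if every element of the target set is the image of some element."""
    return set(f.values()) == set(target)

# h(x) = x^2 restricted to a small set of integers
domain = [-2, -1, 0, 1, 2]
h = {x: x * x for x in domain}
print(is_one_to_one(h))             # False, since h(2) = h(-2) = 4
print(is_onto(h, [0, 1, 4]))        # True: the target equals the image
print(is_onto(h, [0, 1, 4, -16]))   # False: -16 has no preimage
```

Whether a mapping is onto depends on the chosen target set, which is why is_onto takes the target as a separate argument.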





Figure 5-1 


Identity and Inverse Mappings 

Let A be any nonempty set. The mapping f: A → A defined by f(a) = a (that is, the function that assigns
to each element in A itself) is called the identity mapping. It is usually denoted by 1_A or 1 or I. Thus, for any
a ∈ A, we have 1_A(a) = a.

Now let f: A → B. We call g: B → A the inverse of f, written f^{-1}, if

f ∘ g = 1_B and g ∘ f = 1_A

We emphasize that f has an inverse if and only if f is a one-to-one correspondence between A and B; that
is, f is one-to-one and onto (Problem 5.7). Also, if b ∈ B, then f^{-1}(b) = a, where a is the unique element
of A for which f(a) = b.


5.3 Linear Mappings (Linear Transformations) 


We begin with a definition. 

DEFINITION: Let V and U be vector spaces over the same field K. A mapping F: V → U is called a
linear mapping or linear transformation if it satisfies the following two conditions:

(1) For any vectors v, w ∈ V, F(v + w) = F(v) + F(w).

(2) For any scalar k and vector v ∈ V, F(kv) = kF(v).

Namely, F: V → U is linear if it "preserves" the two basic operations of a vector space, that of vector
addition and that of scalar multiplication.

Substituting k = 0 into condition (2), we obtain F(0) = 0. Thus, every linear mapping takes the zero
vector into the zero vector.

Now for any scalars a, b ∈ K and any vectors v, w ∈ V, we obtain

F(av + bw) = F(av) + F(bw) = aF(v) + bF(w)

More generally, for any scalars a_i ∈ K and any vectors v_i ∈ V, we obtain the following basic property of
linear mappings:

F(a_1 v_1 + a_2 v_2 + ... + a_m v_m) = a_1 F(v_1) + a_2 F(v_2) + ... + a_m F(v_m)


Remark 1: A linear mapping F: V → U is completely characterized by the condition

F(av + bw) = aF(v) + bF(w) (*)

and so this condition is sometimes used as its definition.

Remark 2: The term linear transformation rather than linear mapping is frequently used for linear
mappings of the form F: R^n → R^m.

EXAMPLE 5.4 

(a) Let F: R^3 → R^3 be the "projection" mapping into the xy-plane; that is, F is the mapping defined by
F(x, y, z) = (x, y, 0). We show that F is linear. Let v = (a, b, c) and w = (a', b', c'). Then

F(v + w) = F(a + a', b + b', c + c') = (a + a', b + b', 0)

= (a, b, 0) + (a', b', 0) = F(v) + F(w)

and, for any scalar k,

F(kv) = F(ka, kb, kc) = (ka, kb, 0) = k(a, b, 0) = kF(v)

Thus, F is linear.

(b) Let G: R^2 → R^2 be the "translation" mapping defined by G(x, y) = (x + 1, y + 2). [That is, G adds the vector
(1, 2) to any vector v = (x, y) in R^2.] Note that

G(0) = G(0, 0) = (1, 2) ≠ 0


Thus, the zero vector is not mapped into the zero vector. Hence, G is not linear.
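
The two conditions of the definition can be spot-checked on sample vectors. In the Python sketch below, the helpers add and scale and the test vectors are our own illustrative choices. A passing check on a few vectors does not prove linearity (the algebraic argument above does that), but a single failing check, such as G(0) ≠ 0, is enough to rule it out.

```python
def add(v, w):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    """Scalar multiplication."""
    return tuple(k * a for a in v)

def F(v):
    """Projection of R^3 into the xy-plane."""
    x, y, z = v
    return (x, y, 0)

def G(v):
    """Translation of R^2 by the vector (1, 2)."""
    x, y = v
    return (x + 1, y + 2)

v, w, k = (1, 2, 3), (4, -5, 6), 7
print(F(add(v, w)) == add(F(v), F(w)))    # True
print(F(scale(k, v)) == scale(k, F(v)))   # True
print(G((0, 0)))                          # (1, 2), not the zero vector
```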
EXAMPLE 5.5 (Derivative and Integral Mappings) Consider the vector space V = P(t) of polynomials over the real
field R. Let u(t) and v(t) be any polynomials in V and let k be any scalar.

(a) Let D: V → V be the derivative mapping. One proves in calculus that

d(u + v)/dt = du/dt + dv/dt and d(ku)/dt = k du/dt

That is, D(u + v) = D(u) + D(v) and D(ku) = kD(u). Thus, the derivative mapping is linear.

(b) Let J: V → R be an integral mapping, say

J(f(t)) = ∫_0^1 f(t) dt

One also proves in calculus that

∫_0^1 [u(t) + v(t)] dt = ∫_0^1 u(t) dt + ∫_0^1 v(t) dt

and

∫_0^1 ku(t) dt = k ∫_0^1 u(t) dt

That is, J(u + v) = J(u) + J(v) and J(ku) = kJ(u). Thus, the integral mapping is linear.


EXAMPLE 5.6 (Zero and Identity Mappings) 

(a) Let F: V → U be the mapping that assigns the zero vector 0 ∈ U to every vector v ∈ V. Then, for any vectors
v, w ∈ V and any scalar k ∈ K, we have

F(v + w) = 0 = 0 + 0 = F(v) + F(w) and F(kv) = 0 = k0 = kF(v)

Thus, F is linear. We call F the zero mapping, and we usually denote it by 0.

(b) Consider the identity mapping I: V → V, which maps each v ∈ V into itself. Then, for any vectors v, w ∈ V and
any scalars a, b ∈ K, we have

I(av + bw) = av + bw = aI(v) + bI(w)

Thus, I is linear.

Our next theorem (proved in Problem 5.13) gives us an abundance of examples of linear mappings. In
particular, it tells us that a linear mapping is completely determined by its values on the elements of a basis.


THEOREM 5.2: Let V and U be vector spaces over a field K. Let {v_1, v_2, ..., v_n} be a basis of V and let
u_1, u_2, ..., u_n be any vectors in U. Then there exists a unique linear mapping
F: V → U such that F(v_1) = u_1, F(v_2) = u_2, ..., F(v_n) = u_n.

We emphasize that the vectors u_1, u_2, ..., u_n in Theorem 5.2 are completely arbitrary; they may be
linearly dependent or they may even be equal to each other.
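
To make Theorem 5.2 concrete, the following Python sketch builds the mapping F in the special case V = R^2. Everything in it is an illustrative assumption rather than the text's construction: the function name extend_linearly, the sample basis {(1, 1), (1, 2)}, and the deliberately equal image vectors. The idea is to find the coordinates (x, y) of v relative to the basis and then set F(v) = x u_1 + y u_2.

```python
from fractions import Fraction

def extend_linearly(v1, v2, u1, u2):
    """Return the linear map F on R^2 with F(v1) = u1 and F(v2) = u2."""
    det = v1[0] * v2[1] - v2[0] * v1[1]   # nonzero because {v1, v2} is a basis
    def F(v):
        # Coordinates (x, y) of v relative to {v1, v2}, by Cramer's rule.
        x = Fraction(v[0] * v2[1] - v2[0] * v[1], det)
        y = Fraction(v1[0] * v[1] - v[0] * v1[1], det)
        return tuple(x * a + y * b for a, b in zip(u1, u2))
    return F

# The images are chosen equal on purpose: they need not be independent.
F = extend_linearly((1, 1), (1, 2), (2, 0, 1), (2, 0, 1))
print(F((1, 1)) == (2, 0, 1))   # True: F(v1) = u1
print(F((1, 2)) == (2, 0, 1))   # True: F(v2) = u2
print(F((3, 5)) == (6, 0, 3))   # True: (3, 5) = 1*v1 + 2*v2
```

Uniqueness in the theorem corresponds to the fact that the coordinates (x, y) of each v are unique, so there is no freedom left in defining F(v).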


Matrices as Linear Mappings 

Let A be any m × n matrix over K. Recall that A determines a mapping F_A: K^n → K^m by F_A(u) = Au (where
the vectors in K^n and K^m are written as columns). We show F_A is linear. By the properties of matrix multiplication,

F_A(v + w) = A(v + w) = Av + Aw = F_A(v) + F_A(w)

F_A(kv) = A(kv) = k(Av) = kF_A(v)

In other words, using A to represent the mapping, we have

A(v + w) = Av + Aw and A(kv) = k(Av)

Thus, the matrix mapping A is linear.
Vector Space Isomorphism

The notion of two vector spaces being isomorphic was defined in Chapter 4 when we investigated the
coordinates of a vector relative to a basis. We now redefine this concept.

DEFINITION: Two vector spaces V and U over K are isomorphic, written V ≅ U, if there exists a
bijective (one-to-one and onto) linear mapping F: V → U. The mapping F is then called
an isomorphism between V and U.

Consider any vector space V of dimension n and let S be any basis of V. Then the mapping

v ↦ [v]_S

which maps each vector v ∈ V into its coordinate vector [v]_S, is an isomorphism between V and K^n.

5.4 Kernel and Image of a Linear Mapping 

We begin by defining two concepts. 

DEFINITION: Let F: V → U be a linear mapping. The kernel of F, written Ker F, is the set of elements
in V that map into the zero vector 0 in U; that is,

Ker F = {v ∈ V : F(v) = 0}

The image (or range) of F, written Im F, is the set of image points in U; that is,

Im F = {u ∈ U : there exists v ∈ V for which F(v) = u}

The following theorem is easily proved (Problem 5.22).

THEOREM 5.3: Let F: V → U be a linear mapping. Then the kernel of F is a subspace of V and the
image of F is a subspace of U.

Now suppose that v_1, v_2, ..., v_m span a vector space V and that F: V → U is linear. We show that
F(v_1), F(v_2), ..., F(v_m) span Im F. Let u ∈ Im F. Then there exists v ∈ V such that F(v) = u. Because
the v_i's span V and v ∈ V, there exist scalars a_1, a_2, ..., a_m for which

v = a_1 v_1 + a_2 v_2 + ... + a_m v_m

Therefore,

u = F(v) = F(a_1 v_1 + a_2 v_2 + ... + a_m v_m) = a_1 F(v_1) + a_2 F(v_2) + ... + a_m F(v_m)

Thus, the vectors F(v_1), F(v_2), ..., F(v_m) span Im F.

We formally state the above result.

PROPOSITION 5.4: Suppose v_1, v_2, ..., v_m span a vector space V, and suppose F: V → U is linear.
Then F(v_1), F(v_2), ..., F(v_m) span Im F.

EXAMPLE 5.7 

(a) Let F: R^3 → R^3 be the projection of a vector v into the xy-plane [as pictured in Fig. 5-2(a)]; that is,

F(x, y, z) = (x, y, 0)

Clearly the image of F is the entire xy-plane, that is, points of the form (x, y, 0). Moreover, the kernel of F is the
z-axis, that is, points of the form (0, 0, c). That is,

Im F = {(a, b, c) : c = 0} = xy-plane and Ker F = {(a, b, c) : a = 0, b = 0} = z-axis

(b) Let G: R^3 → R^3 be the linear mapping that rotates a vector v about the z-axis through an angle θ [as pictured in
Fig. 5-2(b)]; that is,

G(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z)

Figure 5-2


Observe that the distance of a vector v from the origin O does not change under the rotation, and so only the zero
vector 0 is mapped into the zero vector 0. Thus, Ker G = {0}. On the other hand, every vector u in R^3 is the image of
a vector v in R^3 that can be obtained by rotating u back by an angle of θ. Thus, Im G = R^3, the entire space.

EXAMPLE 5.8 Consider the vector space V = P(t) of polynomials over the real field R, and let H: V → V be the
third-derivative operator; that is, H[f(t)] = d^3 f/dt^3. [Sometimes the notation D^3 is used for H, where D is the
derivative operator.] We claim that

Ker H = {polynomials of degree ≤ 2} = P_2(t) and Im H = V

The first comes from the fact that H(at^2 + bt + c) = 0 but H(t^n) ≠ 0 for n ≥ 3. The second comes from the fact that
every polynomial g(t) in V is the third derivative of some polynomial f(t) (which can be obtained by taking the
antiderivative of g(t) three times).
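
The claim about Ker H can be checked on a few polynomials. The Python sketch below assumes, for illustration only, that a polynomial is stored as a list of coefficients with the lowest degree first; the function name H simply mirrors the operator in the example.

```python
def H(coeffs):
    """Third derivative of c0 + c1*t + c2*t^2 + ... (lowest degree first)."""
    for _ in range(3):
        # One differentiation: multiply each coefficient by its power, drop the constant.
        coeffs = [k * c for k, c in enumerate(coeffs)][1:]
    return coeffs or [0]

print(H([7, -1, 4]))           # [0]: 4t^2 - t + 7 lies in Ker H
print(H([0, 0, 0, 1]))         # [6]: H(t^3) = 6
print(H([0, 0, 0, 0, 0, 1]))   # [0, 0, 60]: H(t^5) = 60t^2
```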


Kernel and Image of Matrix Mappings 

Consider, say, a 3 x 4 matrix A and the usual basis {e 1 ,e 2 ,e 3 ,e 4 } of K 4 (written as columns): 


A = 

a \ 

b \ 

a 2 

b 2 

a 3 

b 3 

a 4 

b 4 

, c i = 

T 

0 

0 


_ c i 

Cl 

Cl 

c 4 _ 


0 



'O' 


'O' 


'o' 


1 


0 


0 

e 2 = 

0 

. e 3 = 

1 

5 ^4 

0 


0 


0 


1 


Recall that $A$ may be viewed as a linear mapping $A: K^4 \to K^3$, where the vectors in $K^4$ and $K^3$ are viewed
as column vectors. Now the usual basis vectors span $K^4$, so their images $Ae_1, Ae_2, Ae_3, Ae_4$ span the image
of $A$. But the vectors $Ae_1, Ae_2, Ae_3, Ae_4$ are precisely the columns of $A$:

\[ Ae_1 = [a_1, b_1, c_1]^T, \quad Ae_2 = [a_2, b_2, c_2]^T, \quad Ae_3 = [a_3, b_3, c_3]^T, \quad Ae_4 = [a_4, b_4, c_4]^T \]

Thus, the image of $A$ is precisely the column space of $A$.

On the other hand, the kernel of $A$ consists of all vectors $v$ for which $Av = 0$. This means that the kernel
of $A$ is the solution space of the homogeneous system $AX = 0$, called the null space of $A$.

We state the above results formally. 


PROPOSITION 5.5: Let $A$ be any $m \times n$ matrix over a field $K$ viewed as a linear map $A: K^n \to K^m$. Then
Ker $A$ = nullsp($A$) and Im $A$ = colsp($A$)

Here colsp(A) denotes the column space of A, and nullsp(A) denotes the null space of A. 

Rank and Nullity of a Linear Mapping

Let $F: V \to U$ be a linear mapping. The rank of $F$ is defined to be the dimension of its image, and the
nullity of $F$ is defined to be the dimension of its kernel; namely,

rank($F$) = dim(Im $F$) and nullity($F$) = dim(Ker $F$)

The following important theorem (proved in Problem 5.23) holds. 

THEOREM 5.6: Let $V$ be of finite dimension, and let $F: V \to U$ be linear. Then
dim $V$ = dim(Ker $F$) + dim(Im $F$) = nullity($F$) + rank($F$)

Recall that the rank of a matrix A was also defined to be the dimension of its column space and row 
space. If we now view A as a linear mapping, then both definitions correspond, because the image of A is 
precisely its column space. 


EXAMPLE 5.9 Let $F: \mathbf{R}^4 \to \mathbf{R}^3$ be the linear mapping defined by

\[ F(x, y, z, t) = (x - y + z + t,\; 2x - 2y + 3z + 4t,\; 3x - 3y + 4z + 5t) \]

(a) Find a basis and the dimension of the image of F. 

First find the images of the usual basis vectors of $\mathbf{R}^4$:

\[
\begin{aligned}
F(1, 0, 0, 0) &= (1, 2, 3), & F(0, 0, 1, 0) &= (1, 3, 4) \\
F(0, 1, 0, 0) &= (-1, -2, -3), & F(0, 0, 0, 1) &= (1, 4, 5)
\end{aligned}
\]


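By Proposition 5.4, the image vectors span Im $F$. Hence, form the matrix $M$ whose rows are these image vectors
and row reduce to echelon form:

\[
M = \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 1 & 3 & 4 \\ 1 & 4 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 2 & 2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]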


Thus, (1,2, 3) and (0,1,1) form a basis of Im F. Hence, dim(Im F) = 2 and rank(F) = 2. 

(b) Find a basis and the dimension of the kernel of the map F. 

Set $F(v) = 0$, where $v = (x, y, z, t)$:

\[ F(x, y, z, t) = (x - y + z + t,\; 2x - 2y + 3z + 4t,\; 3x - 3y + 4z + 5t) = (0, 0, 0) \]

Set corresponding components equal to each other to form the following homogeneous system whose solution
space is Ker $F$:

\[
\begin{aligned} x - y + z + t &= 0 \\ 2x - 2y + 3z + 4t &= 0 \\ 3x - 3y + 4z + 5t &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x - y + z + t &= 0 \\ z + 2t &= 0 \\ z + 2t &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x - y + z + t &= 0 \\ z + 2t &= 0 \end{aligned}
\]


The free variables are y and t. Hence, dim(Ker F) = 2 or nullity(F) = 2. 

(i) Set $y = 1$, $t = 0$ to obtain the solution $(1, 1, 0, 0)$,

(ii) Set $y = 0$, $t = 1$ to obtain the solution $(1, 0, -2, 1)$.

Thus, $(1, 1, 0, 0)$ and $(1, 0, -2, 1)$ form a basis for Ker $F$.

As expected from Theorem 5.6, dim(Im $F$) + dim(Ker $F$) = 4 = dim $\mathbf{R}^4$.
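
The computation in Example 5.9 can be checked mechanically. The following is a minimal sketch in Python, assuming exact rational arithmetic via the standard `fractions` module; the helper names `row_echelon` and `rank` are illustrative and do not come from the text. It row reduces the matrix whose rows are the images of the usual basis vectors and reads off the rank, then obtains the nullity from Theorem 5.6.

```python
from fractions import Fraction


def row_echelon(rows):
    """Return an echelon form of the matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        # Eliminate the entries below the pivot.
        for r in range(pivot_row + 1, len(m)):
            factor = m[r][col] / m[pivot_row][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m


def rank(rows):
    """Rank = number of nonzero rows in an echelon form."""
    return sum(1 for row in row_echelon(rows) if any(x != 0 for x in row))


# Rows are F(1,0,0,0), F(0,1,0,0), F(0,0,1,0), F(0,0,0,1) from Example 5.9.
M = [[1, 2, 3], [-1, -2, -3], [1, 3, 4], [1, 4, 5]]
rank_F = rank(M)
nullity_F = 4 - rank_F  # Theorem 5.6 with dim R^4 = 4
print(rank_F, nullity_F)
```

Working over `Fraction` rather than floating point keeps the zero tests exact, which matters when deciding whether a row has been eliminated.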

Application to Systems of Linear Equations 

Let $AX = B$ denote the matrix form of a system of $m$ linear equations in $n$ unknowns. Now the matrix $A$
may be viewed as a linear mapping

\[ A: K^n \to K^m \]

Thus, the solution of the equation $AX = B$ may be viewed as the preimage of the vector $B \in K^m$ under the
linear mapping $A$. Furthermore, the solution of the associated homogeneous system

\[ AX = 0 \]

may be viewed as the kernel of the linear mapping $A$. Applying Theorem 5.6 to this homogeneous system
yields

\[ \dim(\operatorname{Ker} A) = \dim K^n - \dim(\operatorname{Im} A) = n - \operatorname{rank} A \]

But n is exactly the number of unknowns in the homogeneous system AX = 0. Thus, we have proved the 
following theorem of Chapter 4. 

THEOREM 4.19: The dimension of the solution space $W$ of a homogeneous system $AX = 0$ of linear
equations is $s = n - r$, where $n$ is the number of unknowns and $r$ is the rank of the
coefficient matrix $A$.

Observe that $r$ is also the number of pivot variables in an echelon form of $AX = 0$, so $s = n - r$ is also
the number of free variables. Furthermore, the $s$ solution vectors of $AX = 0$ described in Theorem 3.14 are
linearly independent (Problem 4.52). Accordingly, because dim $W = s$, they form a basis for the solution
space $W$. Thus, we have also proved Theorem 3.14.


5.5 Singular and Nonsingular Linear Mappings, Isomorphisms 


Let $F: V \to U$ be a linear mapping. Recall that $F(0) = 0$. $F$ is said to be singular if the image of some
nonzero vector $v$ is 0; that is, if there exists $v \ne 0$ such that $F(v) = 0$. Thus, $F: V \to U$ is nonsingular if
the zero vector 0 is the only vector whose image under $F$ is 0 or, in other words, if Ker $F = \{0\}$.

EXAMPLE 5.10 Consider the projection map $F: \mathbf{R}^3 \to \mathbf{R}^3$ and the rotation map $G: \mathbf{R}^3 \to \mathbf{R}^3$ appearing in
Fig. 5-2. (See Example 5.7.) Because the kernel of $F$ is the $z$-axis, $F$ is singular. On the other hand, the kernel of
$G$ consists only of the zero vector 0. Thus, $G$ is nonsingular.

Nonsingular linear mappings may also be characterized as those mappings that carry independent sets 
into independent sets. Specifically, we prove (Problem 5.28) the following theorem. 

THEOREM 5.7: Let F : V —> U be a nonsingular linear mapping. Then the image of any linearly 
independent set is linearly independent. 


Isomorphisms 

Suppose a linear mapping $F: V \to U$ is one-to-one. Then only $0 \in V$ can map into $0 \in U$, and so $F$ is
nonsingular. The converse is also true. For suppose $F$ is nonsingular and $F(v) = F(w)$; then
$F(v - w) = F(v) - F(w) = 0$, and hence, $v - w = 0$ or $v = w$. Thus, $F(v) = F(w)$ implies $v = w$;
that is, $F$ is one-to-one. We have proved the following proposition.

PROPOSITION 5.8: A linear mapping F: V —» U is one-to-one if and only if F is nonsingular. 

Recall that a mapping $F: V \to U$ is called an isomorphism if $F$ is linear and if $F$ is bijective (i.e., if $F$ is
one-to-one and onto). Also, recall that a vector space $V$ is said to be isomorphic to a vector space $U$,
written $V \cong U$, if there is an isomorphism $F: V \to U$.

The following theorem (proved in Problem 5.29) applies. 

THEOREM 5.9: Suppose $V$ has finite dimension and dim $V$ = dim $U$. Suppose $F: V \to U$ is linear.
Then $F$ is an isomorphism if and only if $F$ is nonsingular.

5.6 Operations with Linear Mappings


We are able to combine linear mappings in various ways to obtain new linear mappings. These operations 
are very important and will be used throughout the text. 

Let $F: V \to U$ and $G: V \to U$ be linear mappings over a field $K$. The sum $F + G$ and the scalar
product $kF$, where $k \in K$, are defined to be the following mappings from $V$ into $U$:

\[ (F + G)(v) = F(v) + G(v) \quad\text{and}\quad (kF)(v) = kF(v) \]

We now show that if $F$ and $G$ are linear, then $F + G$ and $kF$ are also linear. Specifically, for any vectors
$v, w \in V$ and any scalars $a, b \in K$,

\[
\begin{aligned}
(F + G)(av + bw) &= F(av + bw) + G(av + bw) \\
&= aF(v) + bF(w) + aG(v) + bG(w) \\
&= a[F(v) + G(v)] + b[F(w) + G(w)] \\
&= a(F + G)(v) + b(F + G)(w)
\end{aligned}
\]

and

\[
\begin{aligned}
(kF)(av + bw) &= kF(av + bw) = k[aF(v) + bF(w)] \\
&= akF(v) + bkF(w) = a(kF)(v) + b(kF)(w)
\end{aligned}
\]

Thus, F + G and kF are linear. 
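
The definitions of $F + G$ and $kF$ translate directly into code. The sketch below is only an illustration, assuming vectors are represented as Python tuples; the helper names `add_maps`, `scale_map`, and `combo`, and the particular maps `F` and `G`, are hypothetical choices and not taken from the text. It spot-checks the linearity identity on one pair of vectors, which illustrates the argument above but does not replace it.

```python
def add_maps(F, G):
    """Return the mapping F + G, defined by (F + G)(v) = F(v) + G(v)."""
    return lambda v: tuple(a + b for a, b in zip(F(v), G(v)))


def scale_map(k, F):
    """Return the mapping kF, defined by (kF)(v) = k F(v)."""
    return lambda v: tuple(k * a for a in F(v))


def combo(a, v, b, w):
    """Return the linear combination av + bw of two tuples."""
    return tuple(a * p + b * q for p, q in zip(v, w))


# Two sample linear maps R^2 -> R^2 (hypothetical examples).
def F(v):
    x, y = v
    return (x + y, x)


def G(v):
    x, y = v
    return (2 * x, 3 * y)


S = add_maps(F, G)
T = scale_map(5, F)

v, w, a, b = (1, 2), (3, -4), 2, -3
sum_is_linear_here = S(combo(a, v, b, w)) == combo(a, S(v), b, S(w))
scalar_is_linear_here = T(combo(a, v, b, w)) == combo(a, T(v), b, T(w))
print(sum_is_linear_here, scalar_is_linear_here)
```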

The following theorem holds. 

THEOREM 5.10: Let V and U be vector spaces over a field K. Then the collection of all linear mappings 
from V into U with the above operations of addition and scalar multiplication forms a 
vector space over K. 

The vector space of linear mappings in Theorem 5.10 is usually denoted by 
Hom(V, U) 

Here Hom comes from the word "homomorphism." We emphasize that the proof of Theorem 5.10 reduces
to showing that Hom($V$, $U$) does satisfy the eight axioms of a vector space. The zero element of
Hom($V$, $U$) is the zero mapping from $V$ into $U$, denoted by 0 and defined by

\[ 0(v) = 0 \]

for every vector $v \in V$.

Suppose $V$ and $U$ are of finite dimension. Then we have the following theorem.

THEOREM 5.11: Suppose dim $V = m$ and dim $U = n$. Then dim[Hom($V$, $U$)] $= mn$.

Composition of Linear Mappings 

Now suppose $V$, $U$, and $W$ are vector spaces over the same field $K$, and suppose $F: V \to U$ and
$G: U \to W$ are linear mappings. We picture these mappings as follows:

\[ V \xrightarrow{\;F\;} U \xrightarrow{\;G\;} W \]

Recall that the composition function $G \circ F$ is the mapping from $V$ into $W$ defined by
$(G \circ F)(v) = G(F(v))$. We show that $G \circ F$ is linear whenever $F$ and $G$ are linear. Specifically, for any
vectors $v, w \in V$ and any scalars $a, b \in K$, we have

\[
\begin{aligned}
(G \circ F)(av + bw) &= G(F(av + bw)) = G(aF(v) + bF(w)) \\
&= aG(F(v)) + bG(F(w)) = a(G \circ F)(v) + b(G \circ F)(w)
\end{aligned}
\]

Thus, $G \circ F$ is linear.

The composition of linear mappings and the operations of addition and scalar multiplication are related 
as follows. 

THEOREM 5.12: Let $V$, $U$, $W$ be vector spaces over $K$. Suppose the following mappings are linear:

\[ F: V \to U, \quad F': V \to U \quad\text{and}\quad G: U \to W, \quad G': U \to W \]

Then, for any scalar $k \in K$:

(i) $G \circ (F + F') = G \circ F + G \circ F'$.

(ii) $(G + G') \circ F = G \circ F + G' \circ F$.

(iii) $k(G \circ F) = (kG) \circ F = G \circ (kF)$.

5.7 Algebra A(V) of Linear Operators


Let $V$ be a vector space over a field $K$. This section considers the special case of linear mappings from the
vector space $V$ into itself; that is, linear mappings of the form $F: V \to V$. They are also called linear
operators or linear transformations on $V$. We will write $A(V)$, instead of Hom($V$, $V$), for the space of all
such mappings.

Now $A(V)$ is a vector space over $K$ (Theorem 5.10), and, if dim $V = n$, then dim $A(V) = n^2$ (Theorem 5.11). Moreover,
for any mappings $F, G \in A(V)$, the composition $G \circ F$ exists and also belongs to $A(V)$. Thus, we have a
"multiplication" defined in $A(V)$. [We sometimes write $FG$ instead of $F \circ G$ in the space $A(V)$.]

Remark: An algebra $A$ over a field $K$ is a vector space over $K$ in which an operation of
multiplication is defined satisfying, for every $F, G, H \in A$ and every $k \in K$:

(i) $F(G + H) = FG + FH$,

(ii) $(G + H)F = GF + HF$,

(iii) $k(GF) = (kG)F = G(kF)$.

The algebra is said to be associative if, in addition, $(FG)H = F(GH)$.

The above definition of an algebra and previous theorems give us the following result. 

THEOREM 5.13: Let $V$ be a vector space over $K$. Then $A(V)$ is an associative algebra over $K$ with
respect to composition of mappings. If dim $V = n$, then dim $A(V) = n^2$.

This is why $A(V)$ is called the algebra of linear operators on $V$.

Polynomials and Linear Operators 

Observe that the identity mapping $I: V \to V$ belongs to $A(V)$. Also, for any linear operator $F$ in $A(V)$, we
have $FI = IF = F$. We can also form "powers" of $F$. Namely, we define

\[ F^0 = I, \quad F^2 = F \circ F, \quad F^3 = F^2 \circ F = F \circ F \circ F, \quad F^4 = F^3 \circ F, \quad \ldots \]

Furthermore, for any polynomial $p(t)$ over $K$, say,

\[ p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_s t^s \]

we can form the linear operator $p(F)$ defined by

\[ p(F) = a_0 I + a_1 F + a_2 F^2 + \cdots + a_s F^s \]

(For any scalar $k$, the operator $kI$ is sometimes denoted simply by $k$.) In particular, we say $F$ is a zero of the
polynomial $p(t)$ if $p(F) = 0$.

EXAMPLE 5.11 Let $F: K^3 \to K^3$ be defined by $F(x, y, z) = (0, x, y)$. For any $(a, b, c) \in K^3$,

\[ (F + I)(a, b, c) = (0, a, b) + (a, b, c) = (a, a + b, b + c) \]

\[ F^3(a, b, c) = F^2(0, a, b) = F(0, 0, a) = (0, 0, 0) \]

Thus, $F^3 = 0$, the zero mapping in $A(V)$. This means $F$ is a zero of the polynomial $p(t) = t^3$.
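
Example 5.11 can be replayed with matrices. The sketch below assumes the operator $F(x, y, z) = (0, x, y)$ is represented by its matrix acting on column vectors, and that a polynomial is given by its coefficient list $[a_0, a_1, \ldots, a_s]$; the function names `mat_mul`, `identity`, and `poly_of_operator` are illustrative and not from the text.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def poly_of_operator(coeffs, F):
    """Evaluate p(F) = a_0 I + a_1 F + ... + a_s F^s for coeffs = [a_0, ..., a_s]."""
    n = len(F)
    result = [[0] * n for _ in range(n)]
    power = identity(n)  # F^0 = I
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = mat_mul(power, F)  # advance to the next power of F
    return result


# Matrix of F(x, y, z) = (0, x, y): its columns are F(e1), F(e2), F(e3).
F = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]

F_plus_I = poly_of_operator([1, 1], F)       # p(t) = 1 + t
F_cubed = poly_of_operator([0, 0, 0, 1], F)  # p(t) = t^3
print(F_plus_I)
print(F_cubed)
```

The rows of `F_plus_I` encode $(a, a + b, b + c)$, and `F_cubed` is the zero matrix, in agreement with the example.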

Square Matrices as Linear Operators

Let $\mathbf{M} = \mathbf{M}_{n,n}$ be the vector space of all square $n \times n$ matrices over $K$. Then any matrix $A$ in $\mathbf{M}$ defines a
linear mapping $F_A: K^n \to K^n$ by $F_A(u) = Au$ (where the vectors in $K^n$ are written as columns). Because the
mapping is from $K^n$ into itself, the square matrix $A$ is a linear operator, not simply a linear mapping.

Suppose $A$ and $B$ are matrices in $\mathbf{M}$. Then the matrix product $AB$ is defined. Furthermore, for any
(column) vector $u$ in $K^n$,

\[ F_{AB}(u) = (AB)u = A(Bu) = A(F_B(u)) = F_A(F_B(u)) = (F_A \circ F_B)(u) \]

In other words, the matrix product AB corresponds to the composition of A and B as linear mappings. 
Similarly, the matrix sum A + B corresponds to the sum of A and B as linear mappings, and the scalar 
product kA corresponds to the scalar product of A as a linear mapping. 
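
As a small numerical illustration of the identity $F_{AB} = F_A \circ F_B$, the following sketch compares $(AB)u$ with $A(Bu)$. The matrices `A`, `B` and the vector `u` are arbitrary sample values chosen for the illustration, and the helpers `mat_mul` and `mat_vec` are hypothetical names, not part of the text.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_vec(A, u):
    """Apply the matrix A to the column vector u, i.e. compute F_A(u) = Au."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]


# Sample data (assumed for illustration only).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
u = [7, -3]

via_product = mat_vec(mat_mul(A, B), u)      # F_AB(u) = (AB)u
via_composition = mat_vec(A, mat_vec(B, u))  # (F_A o F_B)(u) = A(Bu)
print(via_product == via_composition)
```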

Invertible Operators in A(V)

Let $F: V \to V$ be a linear operator. $F$ is said to be invertible if it has an inverse; that is, if there exists $F^{-1}$
in $A(V)$ such that $FF^{-1} = F^{-1}F = I$. On the other hand, $F$ is invertible as a mapping if $F$ is both one-to-one
and onto. In such a case, $F^{-1}$ is also linear and $F^{-1}$ is the inverse of $F$ as a linear operator (proved in
Problem 5.15).

Suppose $F$ is invertible. Then only $0 \in V$ can map into 0, and so $F$ is nonsingular. The converse is
not true, as seen by the following example.

EXAMPLE 5.12 Let $V = P(t)$, the vector space of polynomials over $K$. Let $F$ be the mapping on $V$ that increases
by 1 the exponent of $t$ in each term of a polynomial; that is,

\[ F(a_0 + a_1 t + a_2 t^2 + \cdots + a_s t^s) = a_0 t + a_1 t^2 + a_2 t^3 + \cdots + a_s t^{s+1} \]

Then $F$ is a linear mapping and $F$ is nonsingular. However, $F$ is not onto (no nonzero constant polynomial lies in its image), and so $F$ is not invertible.

The vector space $V = P(t)$ in the above example has infinite dimension. The situation changes
significantly when $V$ has finite dimension. Namely, the following theorem applies.

THEOREM 5.14: Let F be a linear operator on a finite-dimensional vector space V. Then the following 
four conditions are equivalent. 

(i) $F$ is nonsingular: Ker $F = \{0\}$.

(ii) $F$ is one-to-one.

(iii) $F$ is an onto mapping.

(iv) $F$ is invertible.

The proof of the above theorem mainly follows from Theorem 5.6, which tells us that 
dim V = dim(Ker F) + dim(Im F) 

By Proposition 5.8, (i) and (ii) are equivalent. Note that (iv) is equivalent to (ii) and (iii). Thus, to prove the 
theorem, we need only show that (i) and (iii) are equivalent. This we do below. 

(a) Suppose (i) holds. Then dim(Ker F) = 0, and so the above equation tells us that dim V = dim(Im F). 
This means V = Im F or, in other words, F is an onto mapping. Thus, (i) implies (iii). 

(b) Suppose (iii) holds. Then $V$ = Im $F$, and so dim $V$ = dim(Im $F$). Therefore, the above equation tells
us that dim(Ker $F$) = 0, and so $F$ is nonsingular. Therefore, (iii) implies (i).

Accordingly, all four conditions are equivalent. 

Remark: Suppose $A$ is a square $n \times n$ matrix over $K$. Then $A$ may be viewed as a linear operator on
$K^n$. Because $K^n$ has finite dimension, Theorem 5.14 holds for the square matrix $A$. This is why the terms
"nonsingular" and "invertible" are used interchangeably when applied to square matrices.

EXAMPLE 5.13 Let $F$ be the linear operator on $\mathbf{R}^2$ defined by $F(x, y) = (2x + y, 3x + 2y)$.

(a) To show that $F$ is invertible, we need only show that $F$ is nonsingular. Set $F(x, y) = (0, 0)$ to obtain the
homogeneous system

\[ 2x + y = 0 \quad\text{and}\quad 3x + 2y = 0 \]

Solve for $x$ and $y$ to get $x = 0$, $y = 0$. Hence, $F$ is nonsingular and so invertible.

(b) To find a formula for $F^{-1}$, we set $F(x, y) = (s, t)$ and so $F^{-1}(s, t) = (x, y)$. We have

\[ (2x + y,\; 3x + 2y) = (s, t) \quad\text{or}\quad \begin{aligned} 2x + y &= s \\ 3x + 2y &= t \end{aligned} \]

Solve for $x$ and $y$ in terms of $s$ and $t$ to obtain $x = 2s - t$, $y = -3s + 2t$. Thus,

\[ F^{-1}(s, t) = (2s - t,\; -3s + 2t) \quad\text{or}\quad F^{-1}(x, y) = (2x - y,\; -3x + 2y) \]

where we rewrite the formula for $F^{-1}$ using $x$ and $y$ instead of $s$ and $t$.
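
The formula for the inverse in Example 5.13 can be sanity-checked by composing the two maps in both orders. This is a minimal sketch over a small grid of integer points (the grid is an arbitrary choice for illustration); because both maps are linear, agreement on a basis would already suffice, so the grid is deliberately redundant.

```python
def F(x, y):
    """The operator of Example 5.13."""
    return (2 * x + y, 3 * x + 2 * y)


def F_inv(s, t):
    """The candidate inverse found in part (b)."""
    return (2 * s - t, -3 * s + 2 * t)


points = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
left_inverse_ok = all(F_inv(*F(x, y)) == (x, y) for x, y in points)
right_inverse_ok = all(F(*F_inv(x, y)) == (x, y) for x, y in points)
print(left_inverse_ok, right_inverse_ok)
```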


SOLVED PROBLEMS 


Mappings 

5.1. State whether each diagram in Fig. 5-3 defines a mapping from $A = \{a, b, c\}$ into $B = \{x, y, z\}$.

(a) No. There is nothing assigned to the element $b \in A$.

(b) No. Two elements, $x$ and $z$, are assigned to $c \in A$.

(c) Yes. 



Figure 5-3 


5.2. Let $f: A \to B$ and $g: B \to C$ be defined by Fig. 5-4.

(a) Find the composition mapping $(g \circ f): A \to C$.

(b) Find the images of the mappings $f$, $g$, $g \circ f$.

(a) Use the definition of the composition mapping to compute

\[
(g \circ f)(a) = g(f(a)) = g(y) = t, \quad (g \circ f)(b) = g(f(b)) = g(x) = s, \quad (g \circ f)(c) = g(f(c)) = g(y) = t
\]

Observe that we arrive at the same answer if we "follow the arrows" in Fig. 5-4:

\[ a \to y \to t, \quad b \to x \to s, \quad c \to y \to t \]

(b) By Fig. 5-4, the image values under the mapping $f$ are $x$ and $y$, and the image values under $g$ are $r$, $s$, $t$.
Hence,

\[ \operatorname{Im} f = \{x, y\} \quad\text{and}\quad \operatorname{Im} g = \{r, s, t\} \]

Also, by part (a), the image values under the composition mapping $g \circ f$ are $t$ and $s$; accordingly,
Im $g \circ f = \{s, t\}$. Note that the images of $g$ and $g \circ f$ are different.


5.3. Consider the mapping $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (yz, x^2)$. Find

(a) $F(2, 3, 4)$; (b) $F(5, -2, 7)$; (c) $F^{-1}(0, 0)$, that is, all $v \in \mathbf{R}^3$ such that $F(v) = 0$.

(a) Substitute in the formula for $F$ to get $F(2, 3, 4) = (3 \cdot 4, 2^2) = (12, 4)$.

(b) $F(5, -2, 7) = (-2 \cdot 7, 5^2) = (-14, 25)$.

(c) Set $F(v) = 0$, where $v = (x, y, z)$, and then solve for $x$, $y$, $z$:

\[ F(x, y, z) = (yz, x^2) = (0, 0) \quad\text{or}\quad yz = 0,\; x^2 = 0 \]

Thus, $x = 0$ and either $y = 0$ or $z = 0$. In other words, $x = 0$, $y = 0$ or $x = 0$, $z = 0$; that is, the $z$-axis
and the $y$-axis.


5.4. Consider the mapping $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (3y, 2x)$. Let $S$ be the unit circle in $\mathbf{R}^2$,
that is, the solution set of $x^2 + y^2 = 1$. (a) Describe $F(S)$. (b) Find $F^{-1}(S)$.

(a) Let $(a, b)$ be an element of $F(S)$. Then there exists $(x, y) \in S$ such that $F(x, y) = (a, b)$. Hence,

\[ (3y, 2x) = (a, b) \quad\text{or}\quad 3y = a,\; 2x = b \quad\text{or}\quad y = \frac{a}{3},\; x = \frac{b}{2} \]

Because $(x, y) \in S$, that is, $x^2 + y^2 = 1$, we have

\[ \left(\frac{b}{2}\right)^2 + \left(\frac{a}{3}\right)^2 = 1 \quad\text{or}\quad \frac{a^2}{9} + \frac{b^2}{4} = 1 \]

Thus, $F(S)$ is an ellipse.

(b) Let $F(x, y) = (a, b)$, where $(a, b) \in S$. Then $(3y, 2x) = (a, b)$ or $3y = a$, $2x = b$. Because $(a, b) \in S$, we
have $a^2 + b^2 = 1$. Thus, $(3y)^2 + (2x)^2 = 1$. Accordingly, $F^{-1}(S)$ is the ellipse $4x^2 + 9y^2 = 1$.

5.5. Let the mappings $f: A \to B$, $g: B \to C$, $h: C \to D$ be defined by Fig. 5-5. Determine whether or
not each function is (a) one-to-one; (b) onto; (c) invertible (i.e., has an inverse).

(a) The mapping $f: A \to B$ is one-to-one, as each element of $A$ has a different image. The mapping
$g: B \to C$ is not one-to-one, because $x$ and $z$ both have the same image 4. The mapping $h: C \to D$ is
one-to-one.

(b) The mapping $f: A \to B$ is not onto, because $z \in B$ is not the image of any element of $A$. The mapping
$g: B \to C$ is onto, as each element of $C$ is the image of some element of $B$. The mapping $h: C \to D$ is
also onto.

(c) A mapping has an inverse if and only if it is one-to-one and onto. Hence, only h has an inverse. 


Figure 5-5

5.6. Suppose $f: A \to B$ and $g: B \to C$. Hence, $(g \circ f): A \to C$ exists. Prove

(a) If $f$ and $g$ are one-to-one, then $g \circ f$ is one-to-one.

(b) If $f$ and $g$ are onto mappings, then $g \circ f$ is an onto mapping.

(c) If $g \circ f$ is one-to-one, then $f$ is one-to-one.

(d) If $g \circ f$ is an onto mapping, then $g$ is an onto mapping.

(a) Suppose $(g \circ f)(x) = (g \circ f)(y)$. Then $g(f(x)) = g(f(y))$. Because $g$ is one-to-one, $f(x) = f(y)$. Because
$f$ is one-to-one, $x = y$. We have proven that $(g \circ f)(x) = (g \circ f)(y)$ implies $x = y$; hence $g \circ f$ is one-to-one.

(b) Suppose $c \in C$. Because $g$ is onto, there exists $b \in B$ for which $g(b) = c$. Because $f$ is onto, there exists
$a \in A$ for which $f(a) = b$. Thus, $(g \circ f)(a) = g(f(a)) = g(b) = c$. Hence, $g \circ f$ is onto.

(c) Suppose $f$ is not one-to-one. Then there exist distinct elements $x, y \in A$ for which $f(x) = f(y)$. Thus,
$(g \circ f)(x) = g(f(x)) = g(f(y)) = (g \circ f)(y)$. Hence, $g \circ f$ is not one-to-one. Therefore, if $g \circ f$ is
one-to-one, then $f$ must be one-to-one.

(d) If $a \in A$, then $(g \circ f)(a) = g(f(a)) \in g(B)$. Hence, $(g \circ f)(A) \subseteq g(B)$. Suppose $g$ is not onto. Then $g(B)$
is properly contained in $C$ and so $(g \circ f)(A)$ is properly contained in $C$; thus, $g \circ f$ is not onto.
Accordingly, if $g \circ f$ is onto, then $g$ must be onto.

5.7. Prove that $f: A \to B$ has an inverse if and only if $f$ is one-to-one and onto.

Suppose $f$ has an inverse; that is, there exists a function $f^{-1}: B \to A$ for which $f^{-1} \circ f = 1_A$ and
$f \circ f^{-1} = 1_B$. Because $1_A$ is one-to-one, $f$ is one-to-one by Problem 5.6(c), and because $1_B$ is onto, $f$ is onto by
Problem 5.6(d); that is, $f$ is both one-to-one and onto.

Now suppose $f$ is both one-to-one and onto. Then each $b \in B$ is the image of a unique element in $A$, say
$b^*$. Thus, if $f(a) = b$, then $a = b^*$; hence, $f(b^*) = b$. Now let $g$ denote the mapping from $B$ to $A$ defined by
$b \mapsto b^*$. We have

(i) $(g \circ f)(a) = g(f(a)) = g(b) = b^* = a$ for every $a \in A$; hence, $g \circ f = 1_A$.

(ii) $(f \circ g)(b) = f(g(b)) = f(b^*) = b$ for every $b \in B$; hence, $f \circ g = 1_B$.

Accordingly, $f$ has an inverse. Its inverse is the mapping $g$.

5.8. Let $f: \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = 2x - 3$. Now $f$ is one-to-one and onto; hence, $f$ has an inverse
mapping $f^{-1}$. Find a formula for $f^{-1}$.

Let $y$ be the image of $x$ under the mapping $f$; that is, $y = f(x) = 2x - 3$. Hence, $x$ will be the image of $y$
under the inverse mapping $f^{-1}$. Thus, solve for $x$ in terms of $y$ in the above equation to obtain $x = \frac{1}{2}(y + 3)$. Then
the formula defining the inverse function is $f^{-1}(y) = \frac{1}{2}(y + 3)$, or, using $x$ instead of $y$, $f^{-1}(x) = \frac{1}{2}(x + 3)$.

Linear Mappings 

5.9. Suppose the mapping $F: \mathbf{R}^2 \to \mathbf{R}^2$ is defined by $F(x, y) = (x + y, x)$. Show that $F$ is linear.

We need to show that $F(v + w) = F(v) + F(w)$ and $F(kv) = kF(v)$, where $v$ and $w$ are any elements of
$\mathbf{R}^2$ and $k$ is any scalar. Let $v = (a, b)$ and $w = (a', b')$. Then

\[ v + w = (a + a', b + b') \quad\text{and}\quad kv = (ka, kb) \]

We have $F(v) = (a + b, a)$ and $F(w) = (a' + b', a')$. Thus,

\[
\begin{aligned}
F(v + w) &= F(a + a', b + b') = (a + a' + b + b',\; a + a') \\
&= (a + b, a) + (a' + b', a') = F(v) + F(w)
\end{aligned}
\]

and

\[ F(kv) = F(ka, kb) = (ka + kb, ka) = k(a + b, a) = kF(v) \]

Because $v$, $w$, $k$ were arbitrary, $F$ is linear.

5.10. Suppose $F: \mathbf{R}^3 \to \mathbf{R}^2$ is defined by $F(x, y, z) = (x + y + z, 2x - 3y + 4z)$. Show that $F$ is linear.

We argue via matrices. Writing vectors as columns, the mapping $F$ may be written in the form
$F(v) = Av$, where $v = [x, y, z]^T$ and

\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & -3 & 4 \end{bmatrix} \]

Then, using properties of matrices, we have

\[ F(v + w) = A(v + w) = Av + Aw = F(v) + F(w) \]

and

\[ F(kv) = A(kv) = k(Av) = kF(v) \]

Thus, $F$ is linear.

5.11. Show that the following mappings are not linear:

(a) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (xy, x)$

(b) $F: \mathbf{R}^2 \to \mathbf{R}^3$ defined by $F(x, y) = (x + 3, 2y, x + y)$

(c) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (|x|, y + z)$

(a) Let $v = (1, 2)$ and $w = (3, 4)$; then $v + w = (4, 6)$. Also,

\[ F(v) = (1(2), 1) = (2, 1) \quad\text{and}\quad F(w) = (3(4), 3) = (12, 3) \]

Hence,

\[ F(v + w) = (4(6), 4) = (24, 4) \ne (14, 4) = F(v) + F(w) \]

(b) Because $F(0, 0) = (3, 0, 0) \ne (0, 0, 0)$, $F$ cannot be linear.

(c) Let $v = (1, 2, 3)$ and $k = -3$. Then $kv = (-3, -6, -9)$. We have

\[ F(v) = (1, 5) \quad\text{and}\quad kF(v) = -3(1, 5) = (-3, -15) \]

Thus,

\[ F(kv) = F(-3, -6, -9) = (3, -15) \ne kF(v) \]

Accordingly, $F$ is not linear.
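
Each counterexample in Problem 5.11 amounts to one failed identity, which is easy to confirm by direct evaluation. The sketch below assumes vectors are Python tuples; the names `Fa`, `Fb`, `Fc` (for the maps in parts (a), (b), (c)) and the helpers `add` and `scale` are labels introduced here for illustration.

```python
def add(v, w):
    """Componentwise sum of two tuples."""
    return tuple(a + b for a, b in zip(v, w))


def scale(k, v):
    """Scalar multiple of a tuple."""
    return tuple(k * a for a in v)


def Fa(v):  # part (a): F(x, y) = (xy, x)
    x, y = v
    return (x * y, x)


def Fb(v):  # part (b): F(x, y) = (x + 3, 2y, x + y)
    x, y = v
    return (x + 3, 2 * y, x + y)


def Fc(v):  # part (c): F(x, y, z) = (|x|, y + z)
    x, y, z = v
    return (abs(x), y + z)


v, w = (1, 2), (3, 4)
additivity_fails = Fa(add(v, w)) != add(Fa(v), Fa(w))   # (24, 4) versus (14, 4)
zero_not_fixed = Fb((0, 0)) != (0, 0, 0)                # F(0) must be 0 for a linear map
homogeneity_fails = Fc(scale(-3, (1, 2, 3))) != scale(-3, Fc((1, 2, 3)))
print(additivity_fails, zero_not_fixed, homogeneity_fails)
```

A single failing instance is enough to rule out linearity, whereas passing any finite number of such checks would prove nothing.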

5.12. Let $V$ be the vector space of $n$-square real matrices. Let $M$ be an arbitrary but fixed matrix in $V$. Let
$F: V \to V$ be defined by $F(A) = AM + MA$, where $A$ is any matrix in $V$. Show that $F$ is linear.

For any matrices $A$ and $B$ in $V$ and any scalar $k$, we have

\[
\begin{aligned}
F(A + B) &= (A + B)M + M(A + B) = AM + BM + MA + MB \\
&= (AM + MA) + (BM + MB) = F(A) + F(B)
\end{aligned}
\]

and

\[ F(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kF(A) \]

Thus, $F$ is linear.

5.13. Prove Theorem 5.2: Let $V$ and $U$ be vector spaces over a field $K$. Let $\{v_1, v_2, \ldots, v_n\}$ be a basis of
$V$ and let $u_1, u_2, \ldots, u_n$ be any vectors in $U$. Then there exists a unique linear mapping $F: V \to U$
such that $F(v_1) = u_1, F(v_2) = u_2, \ldots, F(v_n) = u_n$.

There are three steps to the proof of the theorem: (1) Define the mapping $F: V \to U$ such that
$F(v_i) = u_i$, $i = 1, \ldots, n$. (2) Show that $F$ is linear. (3) Show that $F$ is unique.

Step 1. Let $v \in V$. Because $\{v_1, \ldots, v_n\}$ is a basis of $V$, there exist unique scalars $a_1, \ldots, a_n \in K$ for which
$v = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$. We define $F: V \to U$ by

\[ F(v) = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n \]

(Because the $a_i$ are unique, the mapping $F$ is well defined.) Now, for $i = 1, \ldots, n$,

\[ v_i = 0v_1 + \cdots + 1v_i + \cdots + 0v_n \]

Hence,

\[ F(v_i) = 0u_1 + \cdots + 1u_i + \cdots + 0u_n = u_i \]

Thus, the first step of the proof is complete.

Step 2. Suppose $v = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$ and $w = b_1 v_1 + b_2 v_2 + \cdots + b_n v_n$. Then

\[ v + w = (a_1 + b_1)v_1 + (a_2 + b_2)v_2 + \cdots + (a_n + b_n)v_n \]

and, for any $k \in K$, $kv = ka_1 v_1 + ka_2 v_2 + \cdots + ka_n v_n$. By definition of the mapping $F$,

\[ F(v) = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n \quad\text{and}\quad F(w) = b_1 u_1 + b_2 u_2 + \cdots + b_n u_n \]

Hence,

\[
\begin{aligned}
F(v + w) &= (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + \cdots + (a_n + b_n)u_n \\
&= (a_1 u_1 + a_2 u_2 + \cdots + a_n u_n) + (b_1 u_1 + b_2 u_2 + \cdots + b_n u_n) \\
&= F(v) + F(w)
\end{aligned}
\]

and

\[ F(kv) = k(a_1 u_1 + a_2 u_2 + \cdots + a_n u_n) = kF(v) \]

Thus, $F$ is linear.

Step 3. Suppose $G: V \to U$ is linear and $G(v_i) = u_i$, $i = 1, \ldots, n$. Let

\[ v = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n \]

Then

\[
\begin{aligned}
G(v) &= G(a_1 v_1 + a_2 v_2 + \cdots + a_n v_n) = a_1 G(v_1) + a_2 G(v_2) + \cdots + a_n G(v_n) \\
&= a_1 u_1 + a_2 u_2 + \cdots + a_n u_n = F(v)
\end{aligned}
\]

Because $G(v) = F(v)$ for every $v \in V$, $G = F$. Thus, $F$ is unique and the theorem is proved.

5.14. Let F : R 2 —> R 2 be the linear mapping for which F (1,2) = (2, 3) and F( 0, 1) = (1,4). [Note that 
{(1,2), (0, 1)} is a basis of R 2 , so such a linear map F exists and is unique by Theorem 5.2.] Find a 
formula for F; that is, find F(a, b). 

Write $(a, b)$ as a linear combination of $(1, 2)$ and $(0, 1)$ using unknowns $x$ and $y$,

\[ (a, b) = x(1, 2) + y(0, 1) = (x, 2x + y), \quad\text{so}\quad a = x,\; b = 2x + y \]

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = a$, $y = -2a + b$. Then

\[ F(a, b) = xF(1, 2) + yF(0, 1) = a(2, 3) + (-2a + b)(1, 4) = (b,\; -5a + 4b) \]
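
The construction in Problem 5.14 (find the coordinates of $(a, b)$ in the given basis, then combine the prescribed images) can be written out directly. This is a minimal sketch; the function names `F` and `closed_form` are chosen here for illustration, and the sample grid of integer points is arbitrary.

```python
def F(a, b):
    """Linear map with F(1, 2) = (2, 3) and F(0, 1) = (1, 4), built as in Problem 5.14."""
    # Coordinates of (a, b) relative to the basis {(1, 2), (0, 1)}.
    x, y = a, -2 * a + b
    # F(a, b) = x F(1, 2) + y F(0, 1).
    return (x * 2 + y * 1, x * 3 + y * 4)


def closed_form(a, b):
    """The formula obtained in the solution."""
    return (b, -5 * a + 4 * b)


agrees_on_basis = F(1, 2) == (2, 3) and F(0, 1) == (1, 4)
agrees_with_formula = all(F(a, b) == closed_form(a, b)
                          for a in range(-3, 4) for b in range(-3, 4))
print(agrees_on_basis, agrees_with_formula)
```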


5.15. Suppose a linear mapping $F: V \to U$ is one-to-one and onto. Show that the inverse mapping
$F^{-1}: U \to V$ is also linear.

Suppose $u, u' \in U$. Because $F$ is one-to-one and onto, there exist unique vectors $v, v' \in V$ for which
$F(v) = u$ and $F(v') = u'$. Because $F$ is linear, we also have

\[ F(v + v') = F(v) + F(v') = u + u' \quad\text{and}\quad F(kv) = kF(v) = ku \]

By definition of the inverse mapping,

\[ F^{-1}(u) = v, \quad F^{-1}(u') = v', \quad F^{-1}(u + u') = v + v', \quad F^{-1}(ku) = kv \]

Then

\[ F^{-1}(u + u') = v + v' = F^{-1}(u) + F^{-1}(u') \quad\text{and}\quad F^{-1}(ku) = kv = kF^{-1}(u) \]

Thus, $F^{-1}$ is linear.

Kernel and Image of Linear Mappings

5.16. Let $F: \mathbf{R}^4 \to \mathbf{R}^3$ be the linear mapping defined by

\[ F(x, y, z, t) = (x - y + z + t,\; x + 2z - t,\; x + y + 3z - 3t) \]

Find a basis and the dimension of (a) the image of $F$, (b) the kernel of $F$.

(a) Find the images of the usual basis of $\mathbf{R}^4$:

\[
\begin{aligned}
F(1, 0, 0, 0) &= (1, 1, 1), & F(0, 0, 1, 0) &= (1, 2, 3) \\
F(0, 1, 0, 0) &= (-1, 0, 1), & F(0, 0, 0, 1) &= (1, -1, -3)
\end{aligned}
\]

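By Proposition 5.4, the image vectors span Im $F$. Hence, form the matrix whose rows are these image
vectors, and row reduce to echelon form:

\[
\begin{bmatrix} 1 & 1 & 1 \\ -1 & 0 & 1 \\ 1 & 2 & 3 \\ 1 & -1 & -3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 1 & 2 \\ 0 & -2 & -4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

Thus, $(1, 1, 1)$ and $(0, 1, 2)$ form a basis for Im $F$; hence, dim(Im $F$) = 2.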

(b) Set $F(v) = 0$, where $v = (x, y, z, t)$; that is, set

\[ F(x, y, z, t) = (x - y + z + t,\; x + 2z - t,\; x + y + 3z - 3t) = (0, 0, 0) \]

Set corresponding entries equal to each other to form the following homogeneous system whose solution
space is Ker $F$:

\[
\begin{aligned} x - y + z + t &= 0 \\ x + 2z - t &= 0 \\ x + y + 3z - 3t &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x - y + z + t &= 0 \\ y + z - 2t &= 0 \\ 2y + 2z - 4t &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x - y + z + t &= 0 \\ y + z - 2t &= 0 \end{aligned}
\]

The free variables are $z$ and $t$. Hence, dim(Ker $F$) = 2.

(i) Set $z = -1$, $t = 0$ to obtain the solution $(2, 1, -1, 0)$.

(ii) Set $z = 0$, $t = 1$ to obtain the solution $(1, 2, 0, 1)$.

Thus, $(2, 1, -1, 0)$ and $(1, 2, 0, 1)$ form a basis of Ker $F$.

[As expected, dim(Im $F$) + dim(Ker $F$) = 2 + 2 = 4 = dim $\mathbf{R}^4$, the domain of $F$.]


5.17. Let G: R 3 —> R 3 be the linear mapping defined by 

\[ G(x, y, z) = (x + 2y - z,\; y + z,\; x + y - 2z) \]

Find a basis and the dimension of (a) the image of G, (b) the kernel of G. 

(a) Find the images of the usual basis of R 3 : 

G(1,0,0) = (1,0,1), G(0,1,0) = (2,1,1), G(0,0,1) = (-1,1, -2) 

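By Proposition 5.4, the image vectors span Im $G$. Hence, form the matrix $M$ whose rows are these image
vectors, and row reduce to echelon form:

\[
M = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 1 \\ -1 & 1 & -2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}
\]

Thus, $(1, 0, 1)$ and $(0, 1, -1)$ form a basis for Im $G$; hence, dim(Im $G$) = 2.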

(b) Set $G(v) = 0$, where $v = (x, y, z)$; that is,

\[ G(x, y, z) = (x + 2y - z,\; y + z,\; x + y - 2z) = (0, 0, 0) \]

Set corresponding entries equal to each other to form the following homogeneous system whose solution
space is Ker $G$:

\[
\begin{aligned} x + 2y - z &= 0 \\ y + z &= 0 \\ x + y - 2z &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y - z &= 0 \\ y + z &= 0 \\ -y - z &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y - z &= 0 \\ y + z &= 0 \end{aligned}
\]

The only free variable is $z$; hence, dim(Ker $G$) = 1. Set $z = 1$; then $y = -1$ and $x = 3$. Thus, $(3, -1, 1)$
forms a basis of Ker $G$. [As expected, dim(Im $G$) + dim(Ker $G$) = 2 + 1 = 3 = dim $\mathbf{R}^3$, the domain
of $G$.]


5.18. Consider the matrix mapping $A: \mathbf{R}^4 \to \mathbf{R}^3$, where

\[ A = \begin{bmatrix} 1 & 2 & 3 & 1 \\ 1 & 3 & 5 & -2 \\ 3 & 8 & 13 & -3 \end{bmatrix} \]

Find a basis and the dimension of (a) the image of $A$, (b) the kernel of $A$.


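(a) The column space of $A$ is equal to Im $A$. Now reduce $A^T$ to echelon form:

\[
A^T = \begin{bmatrix} 1 & 1 & 3 \\ 2 & 3 & 8 \\ 3 & 5 & 13 \\ 1 & -2 & -3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & 2 & 4 \\ 0 & -3 & -6 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

Thus, $\{(1, 1, 3), (0, 1, 2)\}$ is a basis of Im $A$, and dim(Im $A$) = 2.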

(b) Here Ker $A$ is the solution space of the homogeneous system $AX = 0$, where $X = (x, y, z, t)^T$. Thus,
reduce the matrix $A$ of coefficients to echelon form:

\[
A \sim \begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & 1 & 2 & -3 \\ 0 & 2 & 4 & -6 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & 1 & 2 & -3 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + 2y + 3z + t &= 0 \\ y + 2z - 3t &= 0 \end{aligned}
\]

The free variables are $z$ and $t$. Thus, dim(Ker $A$) = 2.

(i) Set $z = 1$, $t = 0$ to get the solution $(1, -2, 1, 0)$.

(ii) Set $z = 0$, $t = 1$ to get the solution $(-7, 3, 0, 1)$.

Thus, $(1, -2, 1, 0)$ and $(-7, 3, 0, 1)$ form a basis for Ker $A$.
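
The free-variable procedure of Problem 5.18(b) can be automated. The following is a sketch, assuming exact arithmetic with `fractions.Fraction`; the names `rref` and `null_space_basis` are illustrative. Note that it reduces all the way to reduced row echelon form, a slightly stronger normal form than the echelon form used in the text, so that each basis vector can be read off by setting one free variable to 1 and the others to 0.

```python
from fractions import Fraction


def rref(rows):
    """Return (reduced row echelon form, list of pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free column
        m[r], m[p] = m[p], m[r]
        pivot_value = m[r][c]
        m[r] = [x / pivot_value for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots


def null_space_basis(rows):
    """One basis vector of Ker A per free variable (set it to 1, the others to 0)."""
    m, pivots = rref(rows)
    n = len(rows[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row_index, p in enumerate(pivots):
            v[p] = -m[row_index][f]
        basis.append(v)
    return basis


A = [[1, 2, 3, 1], [1, 3, 5, -2], [3, 8, 13, -3]]
kernel_basis = null_space_basis(A)
print([[int(x) for x in v] for v in kernel_basis])
```

For this matrix the free variables are $z$ and $t$, so the two vectors produced correspond to the choices (i) and (ii) in the solution.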


5.19. Find a linear map $F: \mathbf{R}^3 \to \mathbf{R}^4$ whose image is spanned by $(1, 2, 0, -4)$ and $(2, 0, -1, -3)$.

Form a $4 \times 3$ matrix whose columns consist only of the given vectors, say

\[ A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 0 & 0 \\ 0 & -1 & -1 \\ -4 & -3 & -3 \end{bmatrix} \]

Recall that $A$ determines a linear map $A: \mathbf{R}^3 \to \mathbf{R}^4$ whose image is spanned by the columns of $A$. Thus, $A$
satisfies the required condition.


5.20. Suppose $f: V \to U$ is linear with kernel $W$, and that $f(v) = u$. Show that the "coset"
$v + W = \{v + w : w \in W\}$ is the preimage of $u$; that is, $f^{-1}(u) = v + W$.

We must prove that (i) $f^{-1}(u) \subseteq v + W$ and (ii) $v + W \subseteq f^{-1}(u)$.

We first prove (i). Suppose $v' \in f^{-1}(u)$. Then $f(v') = u$, and so

\[ f(v' - v) = f(v') - f(v) = u - u = 0 \]

that is, $v' - v \in W$. Thus, $v' = v + (v' - v) \in v + W$, and hence $f^{-1}(u) \subseteq v + W$.

Now we prove (ii). Suppose $v' \in v + W$. Then $v' = v + w$, where $w \in W$. Because $W$ is the kernel of $f$,
we have $f(w) = 0$. Accordingly,

\[ f(v') = f(v + w) = f(v) + f(w) = f(v) + 0 = f(v) = u \]

Thus, $v' \in f^{-1}(u)$, and so $v + W \subseteq f^{-1}(u)$.

Both inclusions imply $f^{-1}(u) = v + W$.

5.21. Suppose $F: V \to U$ and $G: U \to W$ are linear. Prove

(a) rank($G \circ F$) $\le$ rank($G$), (b) rank($G \circ F$) $\le$ rank($F$).

(a) Because $F(V) \subseteq U$, we also have $G(F(V)) \subseteq G(U)$, and so $\dim[G(F(V))] \le \dim[G(U)]$. Then
rank($G \circ F$) $= \dim[(G \circ F)(V)] = \dim[G(F(V))] \le \dim[G(U)] =$ rank($G$).

(b) We have $\dim[G(F(V))] \le \dim[F(V)]$. Hence,

\[ \operatorname{rank}(G \circ F) = \dim[(G \circ F)(V)] = \dim[G(F(V))] \le \dim[F(V)] = \operatorname{rank}(F) \]

5.22. Prove Theorem 5.3: Let $F: V \to U$ be linear. Then,

(a) Im $F$ is a subspace of $U$, (b) Ker $F$ is a subspace of $V$.

(a) Because $F(0) = 0$, we have $0 \in$ Im $F$. Now suppose $u, u' \in$ Im $F$ and $a, b \in K$. Because $u$ and $u'$ belong
to the image of $F$, there exist vectors $v, v' \in V$ such that $F(v) = u$ and $F(v') = u'$. Then

\[ F(av + bv') = aF(v) + bF(v') = au + bu' \in \operatorname{Im} F \]

Thus, the image of $F$ is a subspace of $U$.

(b) Because $F(0) = 0$, we have $0 \in$ Ker $F$. Now suppose $v, w \in$ Ker $F$ and $a, b \in K$. Because $v$ and $w$
belong to the kernel of $F$, $F(v) = 0$ and $F(w) = 0$. Thus,

\[ F(av + bw) = aF(v) + bF(w) = a0 + b0 = 0 + 0 = 0, \quad\text{and so}\quad av + bw \in \operatorname{Ker} F \]

Thus, the kernel of $F$ is a subspace of $V$.

5.23. Prove Theorem 5.6: Suppose V has finite dimension and F: V → U is linear. Then

dim V = dim(Ker F) + dim(Im F) = nullity(F) + rank(F)

Suppose dim(Ker F) = r and {w_1, ..., w_r} is a basis of Ker F, and suppose dim(Im F) = s and
{u_1, ..., u_s} is a basis of Im F. (By Proposition 5.4, Im F has finite dimension.) Because every u_j ∈ Im F,
there exist vectors v_1, ..., v_s in V such that F(v_1) = u_1, ..., F(v_s) = u_s. We claim that the set

B = {w_1, ..., w_r, v_1, ..., v_s}

is a basis of V, that is, (i) B spans V, and (ii) B is linearly independent. Once we prove (i) and (ii), then
dim V = r + s = dim(Ker F) + dim(Im F).

(i) B spans V. Let v ∈ V. Then F(v) ∈ Im F. Because the u_j span Im F, there exist scalars a_1, ..., a_s such
that F(v) = a_1 u_1 + ... + a_s u_s. Set v̂ = a_1 v_1 + ... + a_s v_s - v. Then

F(v̂) = F(a_1 v_1 + ... + a_s v_s - v) = a_1 F(v_1) + ... + a_s F(v_s) - F(v)
 = a_1 u_1 + ... + a_s u_s - F(v) = 0

Thus, v̂ ∈ Ker F. Because the w_i span Ker F, there exist scalars b_1, ..., b_r such that
v̂ = b_1 w_1 + ... + b_r w_r = a_1 v_1 + ... + a_s v_s - v

Accordingly,

v = a_1 v_1 + ... + a_s v_s - b_1 w_1 - ... - b_r w_r

Thus, B spans V.
(ii) B is linearly independent. Suppose

x_1 w_1 + ... + x_r w_r + y_1 v_1 + ... + y_s v_s = 0        (1)

where x_i, y_j ∈ K. Then

0 = F(0) = F(x_1 w_1 + ... + x_r w_r + y_1 v_1 + ... + y_s v_s)
 = x_1 F(w_1) + ... + x_r F(w_r) + y_1 F(v_1) + ... + y_s F(v_s)        (2)

But F(w_i) = 0, since w_i ∈ Ker F, and F(v_j) = u_j. Substituting into (2), we obtain
y_1 u_1 + ... + y_s u_s = 0. Since the u_j are linearly independent, each y_j = 0. Substitution into (1) gives
x_1 w_1 + ... + x_r w_r = 0. Since the w_i are linearly independent, each x_i = 0. Thus B is linearly
independent.
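
As an illustration of Theorem 5.6, the following sketch computes a kernel basis and the rank of one matrix and checks that the two dimensions add up to the dimension of the domain. The matrix A is the coefficient matrix of the map in Supplementary Problem 5.61(b); the helper names rref and kernel_basis are assumptions made for this sketch.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form and the list of pivot columns."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]      # scale the pivot to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:         # clear the rest of the column
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def kernel_basis(rows):
    """Basis of the kernel (one vector per free variable) and the rank."""
    m, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for i, p in enumerate(pivots):
            v[p] = -m[i][f]
        basis.append(v)
    return basis, len(pivots)

# F(x, y, z, t) = (x + 2y + 3z + 2t, 2x + 4y + 7z + 5t, x + 2y + 6z + 5t)
A = [[1, 2, 3, 2], [2, 4, 7, 5], [1, 2, 6, 5]]
basis, rank_A = kernel_basis(A)
nullity_A = len(basis)
print(nullity_A, rank_A, nullity_A + rank_A)
```

Here the nullity and the rank add up to 4, the dimension of the domain R^4.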


Singular and Nonsingular Linear Maps, Isomorphisms 

5.24. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero
vector v whose image is 0.

(a) F: R^2 → R^2 defined by F(x, y) = (x - y, x - 2y).

(b) G: R^2 → R^2 defined by G(x, y) = (2x - 4y, 3x - 6y).

(a) Find Ker F by setting F(v) = 0, where v = (x, y).

(x - y, x - 2y) = (0, 0)    or    x - y = 0, x - 2y = 0    or    x - y = 0, -y = 0

The only solution is x = 0, y = 0. Hence, F is nonsingular.

(b) Set G(x, y) = (0, 0) to find Ker G:

(2x - 4y, 3x - 6y) = (0, 0)    or    2x - 4y = 0, 3x - 6y = 0    or    x - 2y = 0

The system has nonzero solutions, because y is a free variable. Hence, G is singular. Let y = 1 to obtain
the solution v = (2, 1), which is a nonzero vector such that G(v) = 0.


5.25. The linear map F: R^2 → R^2 defined by F(x, y) = (x - y, x - 2y) is nonsingular by the previous
Problem 5.24. Find a formula for F^-1.

Set F(x, y) = (a, b), so that F^-1(a, b) = (x, y). We have

(x - y, x - 2y) = (a, b)    or    x - y = a, x - 2y = b    or    x - y = a, y = a - b

Solve for x and y in terms of a and b to get x = 2a - b, y = a - b. Thus,

F^-1(a, b) = (2a - b, a - b)    or    F^-1(x, y) = (2x - y, x - y)
(The second equation is obtained by replacing a and b by x and y, respectively.)
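
The results of Problems 5.24 and 5.25 can be sanity-checked by direct evaluation. In this minimal sketch the function names F, F_inv, and G simply mirror the maps in the text, and the grid of integer test points is an arbitrary choice.

```python
def F(x, y):
    return (x - y, x - 2 * y)

def F_inv(a, b):
    # Formula derived in Problem 5.25.
    return (2 * a - b, a - b)

def G(x, y):
    return (2 * x - 4 * y, 3 * x - 6 * y)

# Arbitrary integer test points.
grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]

# F_inv undoes F and F undoes F_inv on every test point.
round_trip_ok = all(
    F_inv(*F(x, y)) == (x, y) and F(*F_inv(x, y)) == (x, y) for x, y in grid
)

# The nonzero vector (2, 1) found in Problem 5.24(b) lies in Ker G.
kernel_vector_image = G(2, 1)
print(round_trip_ok, kernel_vector_image)
```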


5.26. Let G: R^2 → R^3 be defined by G(x, y) = (x + y, x - 2y, 3x + y).

(a) Show that G is nonsingular. (b) Find a formula for G^-1.

(a) Set G(x, y) = (0, 0, 0) to find Ker G. We have

(x + y, x - 2y, 3x + y) = (0, 0, 0)    or    x + y = 0, x - 2y = 0, 3x + y = 0
The only solution is x = 0, y = 0; hence, G is nonsingular.

(b) Although G is nonsingular, it is not invertible, because R^2 and R^3 have different dimensions. (Thus,
Theorem 5.9 does not apply.) Accordingly, G^-1 does not exist.
5.27. Suppose that F: V → U is linear and that V is of finite dimension. Show that V and the image of F
have the same dimension if and only if F is nonsingular. Determine all nonsingular linear mappings
T: R^4 → R^3.

By Theorem 5.6, dim V = dim(Im F) + dim(Ker F). Hence, V and Im F have the same dimension if
and only if dim(Ker F) = 0 or Ker F = {0} (i.e., if and only if F is nonsingular).

Because dim R^3 is less than dim R^4, we have that dim(Im T) is less than the dimension of the domain R^4
of T. Accordingly, no linear mapping T: R^4 → R^3 can be nonsingular.

5.28. Prove Theorem 5.7: Let F: V → V be a nonsingular linear mapping. Then the image of any
linearly independent set is linearly independent.

Suppose v_1, v_2, ..., v_n are linearly independent vectors in V. We claim that F(v_1), F(v_2), ..., F(v_n) are
also linearly independent. Suppose a_1 F(v_1) + a_2 F(v_2) + ... + a_n F(v_n) = 0, where a_i ∈ K. Because F is
linear, F(a_1 v_1 + a_2 v_2 + ... + a_n v_n) = 0. Hence,

a_1 v_1 + a_2 v_2 + ... + a_n v_n ∈ Ker F

But F is nonsingular; that is, Ker F = {0}. Hence, a_1 v_1 + a_2 v_2 + ... + a_n v_n = 0. Because the v_i are
linearly independent, all the a_i are 0. Accordingly, the F(v_i) are linearly independent. Thus, the theorem is
proved.

5.29. Prove Theorem 5.9: Suppose V has finite dimension and dim V = dim U. Suppose F : V —> U is 
linear. Then F is an isomorphism if and only if F is nonsingular. 

If F is an isomorphism, then only 0 maps to 0; hence, F is nonsingular. Conversely, suppose F is 
nonsingular. Then dim(Ker F) = 0. By Theorem 5.6, dim V = dim(Ker F) + dim(Im F). Thus, 

dim U = dim V = dim(Im F) 

Because U has finite dimension, Im F = U. This means F maps V onto U. Thus, F is one-to-one and onto; 
that is, F is an isomorphism. 

Operations with Linear Maps 

5.30. Define F: R^3 → R^2 and G: R^3 → R^2 by F(x, y, z) = (2x, y + z) and G(x, y, z) = (x - z, y). Find
formulas defining the maps: (a) F + G, (b) 3F, (c) 2F - 5G.

(a) (F + G)(x, y, z) = F(x, y, z) + G(x, y, z) = (2x, y + z) + (x - z, y) = (3x - z, 2y + z)

(b) (3F)(x, y, z) = 3F(x, y, z) = 3(2x, y + z) = (6x, 3y + 3z)

(c) (2F - 5G)(x, y, z) = 2F(x, y, z) - 5G(x, y, z) = 2(2x, y + z) - 5(x - z, y)
 = (4x, 2y + 2z) + (-5x + 5z, -5y) = (-x + 5z, -3y + 2z)

5.31. Let F: R^3 → R^2 and G: R^2 → R^2 be defined by F(x, y, z) = (2x, y + z) and G(x, y) = (y, x).
Derive formulas defining the mappings: (a) G∘F, (b) F∘G.

(a) (G∘F)(x, y, z) = G(F(x, y, z)) = G(2x, y + z) = (y + z, 2x)

(b) The mapping F∘G is not defined, because the image of G is not contained in the domain of F.

5.32. Prove: (a) The zero mapping 0, defined by 0(v) = 0 ∈ U for every v ∈ V, is the zero element of
Hom(V, U). (b) The negative of F ∈ Hom(V, U) is the mapping (-1)F, that is, -F = (-1)F.

Let F ∈ Hom(V, U). Then, for every v ∈ V:

(a) (F + 0)(v) = F(v) + 0(v) = F(v) + 0 = F(v)

Because (F + 0)(v) = F(v) for every v ∈ V, we have F + 0 = F. Similarly, 0 + F = F.

(b) (F + (-1)F)(v) = F(v) + (-1)F(v) = F(v) - F(v) = 0 = 0(v)

Thus, F + (-1)F = 0. Similarly, (-1)F + F = 0. Hence, -F = (-1)F.
5.33. Suppose F_1, F_2, ..., F_n are linear maps from V into U. Show that, for any scalars a_1, a_2, ..., a_n,
and for any v ∈ V,

(a_1 F_1 + a_2 F_2 + ... + a_n F_n)(v) = a_1 F_1(v) + a_2 F_2(v) + ... + a_n F_n(v)

The mapping a_1 F_1 is defined by (a_1 F_1)(v) = a_1 F_1(v). Hence, the theorem holds for n = 1. Accordingly,
by induction,

(a_1 F_1 + a_2 F_2 + ... + a_n F_n)(v) = (a_1 F_1)(v) + (a_2 F_2 + ... + a_n F_n)(v)
 = a_1 F_1(v) + a_2 F_2(v) + ... + a_n F_n(v)

5.34. Consider linear mappings F: R^3 → R^2, G: R^3 → R^2, H: R^3 → R^2 defined by

F(x, y, z) = (x + y + z, x + y), G(x, y, z) = (2x + z, x + y), H(x, y, z) = (2y, x)

Show that F, G, H are linearly independent [as elements of Hom(R^3, R^2)].

Suppose, for scalars a, b, c ∈ K,

aF + bG + cH = 0        (1)

(Here 0 is the zero mapping.) For e_1 = (1, 0, 0) ∈ R^3, we have 0(e_1) = (0, 0) and

(aF + bG + cH)(e_1) = aF(1, 0, 0) + bG(1, 0, 0) + cH(1, 0, 0)
 = a(1, 1) + b(2, 1) + c(0, 1) = (a + 2b, a + b + c)

Thus by (1), (a + 2b, a + b + c) = (0, 0) and so

a + 2b = 0 and a + b + c = 0        (2)

Similarly for e_2 = (0, 1, 0) ∈ R^3, we have 0(e_2) = (0, 0) and

(aF + bG + cH)(e_2) = aF(0, 1, 0) + bG(0, 1, 0) + cH(0, 1, 0)
 = a(1, 1) + b(0, 1) + c(2, 0) = (a + 2c, a + b)

Thus,    a + 2c = 0 and a + b = 0        (3)

Using (2) and (3), we obtain

a = 0, b = 0, c = 0        (4)

Because (1) implies (4), the mappings F, G, H are linearly independent.
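
The same conclusion can be reached mechanically. A linear map on R^3 is determined by its values on e_1, e_2, e_3, so each of F, G, H can be recorded as a vector of six numbers, and the maps are independent exactly when these three vectors have rank 3. The following is a sketch under that assumption; the helpers rank and as_vector are illustrative names, not from the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def F(x, y, z): return (x + y + z, x + y)
def G(x, y, z): return (2 * x + z, x + y)
def H(x, y, z): return (2 * y, x)

def as_vector(T):
    """Concatenate T(e_1), T(e_2), T(e_3) into one vector of length 6."""
    e = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    return [coord for v in e for coord in T(*v)]

vectors = [as_vector(T) for T in (F, G, H)]
independent = rank(vectors) == 3
print(vectors, independent)
```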

5.35. Let k be a nonzero scalar. Show that a linear map T is singular if and only if kT is singular. Hence,
T is singular if and only if -T is singular.

Suppose T is singular. Then T(v) = 0 for some vector v ≠ 0. Hence,

(kT)(v) = kT(v) = k0 = 0

and so kT is singular.

Now suppose kT is singular. Then (kT)(w) = 0 for some vector w ≠ 0. Hence,

T(kw) = kT(w) = (kT)(w) = 0

But k ≠ 0 and w ≠ 0 implies kw ≠ 0. Thus, T is also singular.

5.36. Find the dimension d of:

(a) Hom(R^3, R^4), (b) Hom(R^5, R^3), (c) Hom(P_3(t), R^2), (d) Hom(M_2,3, R^4).

Use dim[Hom(V, U)] = mn, where dim V = m and dim U = n.

(a) d = 3(4) = 12.
(b) d = 5(3) = 15.
(c) Because dim P_3(t) = 4, d = 4(2) = 8.
(d) Because dim M_2,3 = 6, d = 6(4) = 24.
5.37. Prove Theorem 5.11: Suppose dim V = m and dim U = n. Then dim[Hom(V, U)] = mn.

Suppose {v_1, ..., v_m} is a basis of V and {u_1, ..., u_n} is a basis of U. By Theorem 5.2, a linear mapping
in Hom(V, U) is uniquely determined by arbitrarily assigning elements of U to the basis elements v_i of V. We
define

F_ij ∈ Hom(V, U), i = 1, ..., m, j = 1, ..., n

to be the linear mapping for which F_ij(v_i) = u_j, and F_ij(v_k) = 0 for k ≠ i. That is, F_ij maps v_i into u_j and the
other v's into 0. Observe that {F_ij} contains exactly mn elements; hence, the theorem is proved if we show
that it is a basis of Hom(V, U).

Proof that {F_ij} generates Hom(V, U). Consider an arbitrary function F ∈ Hom(V, U). Suppose
F(v_1) = w_1, F(v_2) = w_2, ..., F(v_m) = w_m. Because w_k ∈ U, it is a linear combination of the u's; say,

w_k = a_k1 u_1 + a_k2 u_2 + ... + a_kn u_n, k = 1, ..., m, a_ij ∈ K        (1)

Consider the linear mapping G = Σ_{i=1..m} Σ_{j=1..n} a_ij F_ij. Because G is a linear combination of the F_ij, the proof
that {F_ij} generates Hom(V, U) is complete if we show that F = G.

We now compute G(v_k), k = 1, ..., m. Because F_ij(v_k) = 0 for k ≠ i and F_kj(v_k) = u_j,

G(v_k) = Σ_{i=1..m} Σ_{j=1..n} a_ij F_ij(v_k) = Σ_{j=1..n} a_kj F_kj(v_k) = Σ_{j=1..n} a_kj u_j
 = a_k1 u_1 + a_k2 u_2 + ... + a_kn u_n

Thus, by (1), G(v_k) = w_k for each k. But F(v_k) = w_k for each k. Accordingly, by Theorem 5.2, F = G;
hence, {F_ij} generates Hom(V, U).

Proof that {F_ij} is linearly independent. Suppose, for scalars c_ij ∈ K,

Σ_{i=1..m} Σ_{j=1..n} c_ij F_ij = 0

For v_k, k = 1, ..., m,

0 = 0(v_k) = Σ_{i=1..m} Σ_{j=1..n} c_ij F_ij(v_k) = Σ_{j=1..n} c_kj F_kj(v_k) = Σ_{j=1..n} c_kj u_j
 = c_k1 u_1 + c_k2 u_2 + ... + c_kn u_n

But the u_j are linearly independent; hence, for k = 1, ..., m, we have c_k1 = 0, c_k2 = 0, ..., c_kn = 0. In other
words, all the c_ij = 0, and so {F_ij} is linearly independent.
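
For V = K^m and U = K^n with their usual bases, the maps F_ij of the proof become the "matrix units". The sketch below illustrates that case only: indices are 0-based, and both the helper name unit_map and the sample matrix F are hypothetical. It counts the mn basis maps and rebuilds F as the combination Σ a_ij F_ij.

```python
def unit_map(i, j, m, n):
    """n x m matrix of F_ij: sends basis vector v_i to u_j and every other v_k to 0."""
    return [[1 if (row == j and col == i) else 0 for col in range(m)]
            for row in range(n)]

m, n = 3, 2  # dim V = 3, dim U = 2
basis = {(i, j): unit_map(i, j, m, n) for i in range(m) for j in range(n)}

# Hypothetical map F: column k of its matrix holds the coordinates of F(v_k).
F = [[5, -1, 0],
     [2, 7, 4]]

# a[k][j] is the coefficient of u_j in F(v_k), as in equation (1) of the proof.
a = [[F[j][k] for j in range(n)] for k in range(m)]

# G = sum over i, j of a_ij * F_ij, computed entry by entry.
G = [[sum(a[i][j] * basis[(i, j)][row][col] for i in range(m) for j in range(n))
      for col in range(m)]
     for row in range(n)]

print(len(basis), G == F)
```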


5.38. Prove Theorem 5.12: (i) G∘(F + F') = G∘F + G∘F'. (ii) (G + G')∘F = G∘F + G'∘F.
(iii) k(G∘F) = (kG)∘F = G∘(kF).

(i) For every v ∈ V,

(G∘(F + F'))(v) = G((F + F')(v)) = G(F(v) + F'(v))
 = G(F(v)) + G(F'(v)) = (G∘F)(v) + (G∘F')(v) = (G∘F + G∘F')(v)

Thus, G∘(F + F') = G∘F + G∘F'.

(ii) For every v ∈ V,

((G + G')∘F)(v) = (G + G')(F(v)) = G(F(v)) + G'(F(v))
 = (G∘F)(v) + (G'∘F)(v) = (G∘F + G'∘F)(v)

Thus, (G + G')∘F = G∘F + G'∘F.
(iii) For every v ∈ V,

(k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = (kG)(F(v)) = ((kG)∘F)(v)

and

(k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = G(kF(v)) = G((kF)(v)) = (G∘(kF))(v)

Accordingly, k(G∘F) = (kG)∘F = G∘(kF). (We emphasize that two mappings are shown to be equal
by showing that each of them assigns the same image to each point in the domain.)

Algebra of Linear Maps 

5.39. Let F and G be the linear operators on R^2 defined by F(x, y) = (y, x) and G(x, y) = (0, x). Find
formulas defining the following operators:

(a) F + G, (b) 2F - 3G, (c) FG, (d) GF, (e) F^2, (f) G^2.

(a) (F + G)(x, y) = F(x, y) + G(x, y) = (y, x) + (0, x) = (y, 2x).

(b) (2F - 3G)(x, y) = 2F(x, y) - 3G(x, y) = 2(y, x) - 3(0, x) = (2y, -x).

(c) (FG)(x, y) = F(G(x, y)) = F(0, x) = (x, 0).

(d) (GF)(x, y) = G(F(x, y)) = G(y, x) = (0, y).

(e) F^2(x, y) = F(F(x, y)) = F(y, x) = (x, y). (Note that F^2 = I, the identity mapping.)

(f) G^2(x, y) = G(G(x, y)) = G(0, x) = (0, 0). (Note that G^2 = 0, the zero mapping.)
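
The products in parts (c) through (f) are compositions, which can be mirrored directly in code. In this small sketch the helper compose and the sample point are illustrative assumptions.

```python
def F(v):
    x, y = v
    return (y, x)

def G(v):
    x, y = v
    return (0, x)

def compose(S, T):
    """Return the operator ST, that is, v -> S(T(v))."""
    return lambda v: S(T(v))

FG, GF = compose(F, G), compose(G, F)
F2, G2 = compose(F, F), compose(G, G)

v = (3, 5)  # arbitrary sample point (x, y)
print(FG(v), GF(v), F2(v), G2(v))  # expect (x, 0), (0, y), (x, y), (0, 0)
```

Note that FG and GF differ, so the product of operators is not commutative.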

5.40. Consider the linear operator T on R^3 defined by T(x, y, z) = (2x, 4x - y, 2x + 3y - z).

(a) Show that T is invertible. Find formulas for (b) T^-1, (c) T^2, (d) T^-2.

(a) Let W = Ker T. We need only show that T is nonsingular (i.e., that W = {0}). Set T(x, y, z) = (0, 0, 0),
which yields

T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (0, 0, 0)

Thus, W is the solution space of the homogeneous system

2x = 0, 4x - y = 0, 2x + 3y - z = 0

which has only the trivial solution (0, 0, 0). Thus, W = {0}. Hence, T is nonsingular, and so T is
invertible.

(b) Set T(x, y, z) = (r, s, t) [and so T^-1(r, s, t) = (x, y, z)]. We have

(2x, 4x - y, 2x + 3y - z) = (r, s, t)    or    2x = r, 4x - y = s, 2x + 3y - z = t
Solve for x, y, z in terms of r, s, t to get x = (1/2)r, y = 2r - s, z = 7r - 3s - t. Thus,

T^-1(r, s, t) = ((1/2)r, 2r - s, 7r - 3s - t)    or    T^-1(x, y, z) = ((1/2)x, 2x - y, 7x - 3y - z)

(c) Apply T twice to get

T^2(x, y, z) = T(2x, 4x - y, 2x + 3y - z)
 = [4x, 4(2x) - (4x - y), 2(2x) + 3(4x - y) - (2x + 3y - z)]
 = (4x, 4x + y, 14x - 6y + z)

(d) Apply T^-1 twice to get

T^-2(x, y, z) = T^-1((1/2)x, 2x - y, 7x - 3y - z)
 = [(1/4)x, 2((1/2)x) - (2x - y), 7((1/2)x) - 3(2x - y) - (7x - 3y - z)]
 = ((1/4)x, -x + y, -(19/2)x + 6y + z)
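
The four formulas of Problem 5.40 can be cross-checked with exact rational arithmetic. In this sketch the function names T_inv, T_sq, and T_inv_sq and the sample vectors are assumptions chosen for illustration.

```python
from fractions import Fraction

def T(v):
    x, y, z = v
    return (2 * x, 4 * x - y, 2 * x + 3 * y - z)

def T_inv(v):
    r, s, t = v
    return (Fraction(r) / 2, 2 * r - s, 7 * r - 3 * s - t)

def T_sq(v):
    x, y, z = v
    return (4 * x, 4 * x + y, 14 * x - 6 * y + z)

def T_inv_sq(v):
    x, y, z = v
    return (Fraction(x) / 4, -x + y, Fraction(-19, 2) * x + 6 * y + z)

samples = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -3, 5)]

checks = all(
    T_inv(T(v)) == v                      # T^-1 undoes T
    and T(T_inv(v)) == v                  # T undoes T^-1
    and T(T(v)) == T_sq(v)                # formula for T^2
    and T_inv(T_inv(v)) == T_inv_sq(v)    # formula for T^-2
    for v in samples
)
print(checks)
```

Because the maps are linear, agreement on the three standard basis vectors already implies agreement everywhere; the fourth sample is only an extra check.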
5.41. Let V be of finite dimension and let T be a linear operator on V for which TR = I, for some
operator R on V. (We call R a right inverse of T.)

(a) Show that T is invertible. (b) Show that R = T^-1.

(c) Give an example showing that the above need not hold if V is of infinite dimension.

(a) Let dim V = n. By Theorem 5.14, T is invertible if and only if T is onto; hence, T is invertible if and only
if rank(T) = n. We have n = rank(I) = rank(TR) ≤ rank(T) ≤ n. Hence, rank(T) = n and T is
invertible.

(b) TT^-1 = T^-1 T = I. Then R = IR = (T^-1 T)R = T^-1(TR) = T^-1 I = T^-1.

(c) Let V be the space of polynomials in t over K; say, p(t) = a_0 + a_1 t + a_2 t^2 + ... + a_s t^s. Let T and R be
the operators on V defined by

T(p(t)) = 0 + a_1 + a_2 t + ... + a_s t^(s-1)    and    R(p(t)) = a_0 t + a_1 t^2 + ... + a_s t^(s+1)
We have

(TR)(p(t)) = T(R(p(t))) = T(a_0 t + a_1 t^2 + ... + a_s t^(s+1)) = a_0 + a_1 t + ... + a_s t^s = p(t)

and so TR = I, the identity mapping. On the other hand, if k ∈ K and k ≠ 0, then

(RT)(k) = R(T(k)) = R(0) = 0 ≠ k

Accordingly, RT ≠ I.
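
The counterexample in part (c) is easy to experiment with if a polynomial a_0 + a_1 t + ... + a_s t^s is stored as its coefficient list [a_0, a_1, ..., a_s]. That representation, and the helper is_zero, are assumptions of this sketch.

```python
def T(p):
    """Drop the constant term and lower every remaining power of t by one."""
    return p[1:] if len(p) > 1 else [0]

def R(p):
    """Multiply by t: raise every power by one."""
    return [0] + p

def is_zero(p):
    return all(c == 0 for c in p)

p = [4, 0, -2, 7]          # 4 - 2t^2 + 7t^3, an arbitrary sample polynomial
tr_is_identity = T(R(p)) == p

k = [5]                    # the nonzero constant polynomial 5
rt_kills_constants = is_zero(R(T(k)))
print(tr_is_identity, rt_kills_constants)
```

So R is a right inverse of T, while RT sends every nonzero constant to 0 and therefore is not the identity.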

5.42. Let F and G be linear operators on R^2 defined by F(x, y) = (0, x) and G(x, y) = (x, 0). Show that
(a) GF = 0, the zero mapping, but FG ≠ 0. (b) G^2 = G.

(a) (GF)(x, y) = G(F(x, y)) = G(0, x) = (0, 0). Because GF assigns 0 = (0, 0) to every vector (x, y) in R^2, it
is the zero mapping; that is, GF = 0.

On the other hand, (FG)(x, y) = F(G(x, y)) = F(x, 0) = (0, x). For example, (FG)(2, 3) = (0, 2).
Thus, FG ≠ 0, as it does not assign 0 = (0, 0) to every vector in R^2.

(b) For any vector (x, y) in R^2, we have G^2(x, y) = G(G(x, y)) = G(x, 0) = (x, 0) = G(x, y). Hence, G^2 = G.

5.43. Find the dimension of (a) A(R^4), (b) A(P_2(t)), (c) A(M_2,3).

Use dim[A(V)] = n^2 where dim V = n. Hence, (a) dim[A(R^4)] = 4^2 = 16, (b) dim[A(P_2(t))] = 3^2 = 9,

(c) dim[A(M_2,3)] = 6^2 = 36.

5.44. Let E be a linear operator on V for which E^2 = E. (Such an operator is called a projection.) Let U
be the image of E, and let W be the kernel. Prove

(a) If u ∈ U, then E(u) = u (i.e., E is the identity mapping on U).

(b) If E ≠ I, then E is singular; that is, E(v) = 0 for some v ≠ 0.

(c) V = U ⊕ W.

(a) If u ∈ U, the image of E, then E(v) = u for some v ∈ V. Hence, using E^2 = E, we have

u = E(v) = E^2(v) = E(E(v)) = E(u)

(b) If E ≠ I, then for some v ∈ V, E(v) = u, where v ≠ u. By part (a), E(u) = u. Thus,

E(v - u) = E(v) - E(u) = u - u = 0, where v - u ≠ 0

(c) We first show that V = U + W. Let v ∈ V. Set u = E(v) and w = v - E(v). Then

v = E(v) + v - E(v) = u + w

By definition, u = E(v) ∈ U, the image of E. We now show that w ∈ W, the kernel of E:

E(w) = E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0
and thus w ∈ W. Hence, V = U + W.

We next show that U ∩ W = {0}. Let v ∈ U ∩ W. Because v ∈ U, E(v) = v by part (a). Because
v ∈ W, E(v) = 0. Thus, v = E(v) = 0 and so U ∩ W = {0}.

The above two properties imply that V = U ⊕ W.
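
The decomposition v = u + w used in part (c) can be tried on a concrete projection. The operator E(x, y) = (x + y, 0) below is a hypothetical example chosen for this sketch (it satisfies E^2 = E); the helper split returns u = E(v) and w = v - E(v).

```python
def E(v):
    # Hypothetical projection on R^2: E(E(v)) = E(v) for every v.
    x, y = v
    return (x + y, 0)

def split(v):
    """Return (u, w) with u = E(v) in Im E and w = v - E(v) in Ker E."""
    u = E(v)
    w = (v[0] - u[0], v[1] - u[1])
    return u, w

samples = [(1, 0), (0, 1), (3, -4), (-2, 7)]

all_ok = True
for v in samples:
    u, w = split(v)
    all_ok = all_ok and (
        E(E(v)) == E(v)                          # E is a projection
        and E(u) == u                            # E is the identity on its image
        and E(w) == (0, 0)                       # w lies in the kernel
        and (u[0] + w[0], u[1] + w[1]) == v      # v = u + w
    )
print(all_ok)
```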
SUPPLEMENTARY PROBLEMS


Mappings

5.45. Determine the number of different mappings from (a) {1, 2} into {1, 2, 3}, (b) {1, 2, ..., r} into {1, 2, ..., s}.

5.46. Let f: R → R and g: R → R be defined by f(x) = x^2 + 3x + 1 and g(x) = 2x - 3. Find formulas defining
the composition mappings: (a) f∘g; (b) g∘f; (c) g∘g; (d) f∘f.

5.47. For each mapping f: R → R, find a formula for its inverse: (a) f(x) = 3x - 7, (b) f(x) = x^3 + 2.

5.48. For any mapping f: A → B, show that 1_B∘f = f = f∘1_A.

Linear Mappings 

5.49. Show that the following mappings are linear: 

(a) F: R^3 → R^2 defined by F(x, y, z) = (x + 2y - 3z, 4x - 5y + 6z).

(b) F: R^2 → R^2 defined by F(x, y) = (ax + by, cx + dy), where a, b, c, d belong to R.

5.50. Show that the following mappings are not linear:

(a) F: R^2 → R^2 defined by F(x, y) = (x^2, y^2).

(b) F: R^3 → R^2 defined by F(x, y, z) = (x + 1, y + z).

(c) F: R^2 → R^2 defined by F(x, y) = (xy, y).

(d) F: R^3 → R^2 defined by F(x, y, z) = (|x|, y + z).

5.51. Find F(a, b), where the linear map F: R^2 → R^2 is defined by F(1, 2) = (3, -1) and F(0, 1) = (2, 1).

5.52. Find a 2 x 2 matrix A that maps

(a) (1, 3)^T and (1, 4)^T into (-2, 5)^T and (3, -1)^T, respectively.

(b) (2, -4)^T and (-1, 2)^T into (1, 1)^T and (1, 3)^T, respectively.

5.53. Find a 2 x 2 singular matrix B that maps (1, 1)^T into (1, 3)^T.

5.54. Let V be the vector space of real n-square matrices, and let M be a fixed nonzero matrix in V. Show that the
first two of the following mappings T: V → V are linear, but the third is not:

(a) T(A) = MA, (b) T(A) = AM + MA, (c) T(A) = M + A.

5.55. Give an example of a nonlinear map F: R^2 → R^2 such that F^-1(0) = {0} but F is not one-to-one.

5.56. Let F: R^2 → R^2 be defined by F(x, y) = (3x + 5y, 2x + 3y), and let S be the unit circle in R^2. (S consists of
all points satisfying x^2 + y^2 = 1.) Find (a) the image F(S), (b) the preimage F^-1(S).

5.57. Consider the linear map G: R^3 → R^3 defined by G(x, y, z) = (x + y + z, y - 2z, y - 3z) and the unit sphere
S_2 in R^3, which consists of the points satisfying x^2 + y^2 + z^2 = 1. Find (a) G(S_2), (b) G^-1(S_2).

5.58. Let H be the plane x + 2y - 3z = 4 in R^3 and let G be the linear map in Problem 5.57. Find
(a) G(H), (b) G^-1(H).

5.59. Let W be a subspace of V. The inclusion map, denoted by i: W ↪ V, is defined by i(w) = w for every w ∈ W.
Show that the inclusion map is linear.

5.60. Suppose F: V → U is linear. Show that F(-v) = -F(v).

Kernel and Image of Linear Mappings 

5.61. For each linear map F find a basis and the dimension of the kernel and the image of F:

(a) F: R^3 → R^3 defined by F(x, y, z) = (x + 2y - 3z, 2x + 5y - 4z, x + 4y + z),

(b) F: R^4 → R^3 defined by F(x, y, z, t) = (x + 2y + 3z + 2t, 2x + 4y + 7z + 5t, x + 2y + 6z + 5t).
5.62. For each linear map G, find a basis and the dimension of the kernel and the image of G:

(a) G: R^3 → R^2 defined by G(x, y, z) = (x + y + z, 2x + 2y + 2z),

(b) G: R^3 → R^2 defined by G(x, y, z) = (x + y, y + z),

(c) G: R^5 → R^3 defined by

G(x, y, z, s, t) = (x + 2y + 2z + s + t, x + 2y + 3z + 2s - t, 3x + 6y + 8z + 5s - t).

5.63. Each of the following matrices determines a linear map from R^4 into R^3:

(a) A = [ 1   2   0   1 ]
        [ 2  -1   2  -1 ]
        [ 1  -3   2  -2 ]

(b) B = [  1   0   2  -1 ]
        [  2   3  -1   1 ]
        [ -2   0  -5   3 ]

Find a basis as well as the dimension of the kernel and the image of each linear map.

5.64. Find a linear mapping F: R^3 → R^3 whose image is spanned by (1, 2, 3) and (4, 5, 6).

5.65. Find a linear mapping G: R^4 → R^3 whose kernel is spanned by (1, 2, 3, 4) and (0, 1, 1, 1).

5.66. Let V = P_10(t), the vector space of polynomials of degree ≤ 10. Consider the linear map D^4: V → V, where
D^4 denotes the fourth derivative d^4(f)/dt^4. Find a basis and the dimension of

(a) the image of D^4; (b) the kernel of D^4.

5.67. Suppose F: V → U is linear. Show that (a) the image of any subspace of V is a subspace of U;

(b) the preimage of any subspace of U is a subspace of V.

5.68. Show that if F: V → U is onto, then dim U ≤ dim V. Determine all linear maps F: R^3 → R^4 that are onto.

5.69. Consider the zero mapping 0: V → U defined by 0(v) = 0 for every v ∈ V. Find the kernel and the image of 0.

Operations with Linear Mappings

5.70. Let F: R^3 → R^2 and G: R^3 → R^2 be defined by F(x, y, z) = (y, x + z) and G(x, y, z) = (2z, x - y). Find
formulas defining the mappings F + G and 3F - 2G.

5.71. Let H: R^2 → R^2 be defined by H(x, y) = (y, 2x). Using the maps F and G in Problem 5.70, find formulas
defining the mappings: (a) H∘F and H∘G, (b) F∘H and G∘H, (c) H∘(F + G) and H∘F + H∘G.

5.72. Show that the following mappings F, G, H are linearly independent:

(a) F, G, H ∈ Hom(R^2, R^2) defined by F(x, y) = (x, 2y), G(x, y) = (y, x + y), H(x, y) = (0, x),

(b) F, G, H ∈ Hom(R^3, R) defined by F(x, y, z) = x + y + z, G(x, y, z) = y + z, H(x, y, z) = x - z.

5.73. For F, G ∈ Hom(V, U), show that rank(F + G) ≤ rank(F) + rank(G). (Here V has finite dimension.)

5.74. Let F: V → U and G: U → V be linear. Show that if F and G are nonsingular, then G∘F is nonsingular.
Give an example where G∘F is nonsingular but G is not. [Hint: Let dim V < dim U.]

5.75. Find the dimension d of (a) Hom(R^2, R^8), (b) Hom(P_4(t), R^3), (c) Hom(M_2,4, P_2(t)).

5.76. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero vector v
whose image is 0; otherwise find a formula for the inverse map:

(a) F: R^3 → R^3 defined by F(x, y, z) = (x + y + z, 2x + 3y + 5z, x + 3y + 7z),

(b) G: R^3 → P_2(t) defined by G(x, y, z) = (x + y)t^2 + (x + 2y + 2z)t + y + z,

(c) H: R^2 → P_2(t) defined by H(x, y) = (x + 2y)t^2 + (x - y)t + x + y.

5.77. When can dim[Hom(V, U)] = dim V?
Algebra of Linear Operators

5.78. Let F and G be the linear operators on R^2 defined by F(x, y) = (x + y, 0) and G(x, y) = (-y, x). Find
formulas defining the linear operators: (a) F + G, (b) 5F - 3G, (c) FG, (d) GF, (e) F^2, (f) G^2.

5.79. Show that each linear operator T on R^2 is nonsingular and find a formula for T^-1, where

(a) T(x, y) = (x + 2y, 2x + 3y), (b) T(x, y) = (2x - 3y, 3x - 4y).

5.80. Show that each of the following linear operators T on R^3 is nonsingular and find a formula for T^-1, where

(a) T(x, y, z) = (x - 3y - 2z, y - 4z, z); (b) T(x, y, z) = (x + z, x - y, y).

5.81. Find the dimension of A(V), where (a) V = R^7, (b) V = P_5(t), (c) V = M_3,4.

5.82. Which of the following integers can be the dimension of an algebra A(V) of linear maps:

5, 9, 12, 25, 28, 36, 45, 64, 88, 100?

5.83. Let T be the linear operator on R^2 defined by T(x, y) = (x + 2y, 3x + 4y). Find a formula for f(T), where

(a) f(t) = t^2 + 2t - 3, (b) f(t) = t^2 - 5t - 2.

Miscellaneous Problems 

5.84. Suppose F: V → U is linear and k is a nonzero scalar. Prove that the maps F and kF have the same kernel and
the same image.

5.85. Suppose F and G are linear operators on V and that F is nonsingular. Assume that V has finite dimension.
Show that rank(FG) = rank(GF) = rank(G).

5.86. Suppose V has finite dimension. Suppose T is a linear operator on V such that rank(T^2) = rank(T). Show that
Ker T ∩ Im T = {0}.

5.87. Suppose V = U ⊕ W. Let E_1 and E_2 be the linear operators on V defined by E_1(v) = u, E_2(v) = w, where

v = u + w, u ∈ U, w ∈ W. Show that (a) E_1^2 = E_1 and E_2^2 = E_2 (i.e., that E_1 and E_2 are projections);

(b) E_1 + E_2 = I, the identity mapping; (c) E_1 E_2 = 0 and E_2 E_1 = 0.

5.88. Let E_1 and E_2 be linear operators on V satisfying parts (a), (b), (c) of Problem 5.87. Prove

V = Im E_1 ⊕ Im E_2

5.89. Let v and w be elements of a real vector space V. The line segment L from v to v + w is defined to be the set of
vectors v + tw for 0 ≤ t ≤ 1. (See Fig. 5-6.)

(a) Show that the line segment L between vectors v and u consists of the points:

(i) (1 - t)v + tu for 0 ≤ t ≤ 1, (ii) t_1 v + t_2 u for t_1 + t_2 = 1, t_1 ≥ 0, t_2 ≥ 0.

(b) Let F: V → U be linear. Show that the image F(L) of a line segment L in V is a line segment in U.


Figure 5-6
5.90. Let F: V → U be linear and let W be a subspace of V. The restriction of F to W is the map F|W: W → U
defined by F|W(v) = F(v) for every v in W. Prove the following:

(a) F|W is linear; (b) Ker(F|W) = (Ker F) ∩ W; (c) Im(F|W) = F(W).

5.91. A subset X of a vector space V is said to be convex if the line segment L between any two points (vectors)
P, Q ∈ X is contained in X. (a) Show that the intersection of convex sets is convex; (b) suppose F: V → U is
linear and X is convex. Show that F(X) is convex.


ANSWERS TO SUPPLEMENTARY PROBLEMS 


5.45. (a) 3^2 = 9, (b) s^r

5.46. (a) (f∘g)(x) = 4x^2 - 6x + 1, (b) (g∘f)(x) = 2x^2 + 6x - 1, (c) (g∘g)(x) = 4x - 9,

(d) (f∘f)(x) = x^4 + 6x^3 + 14x^2 + 15x + 5

5.47. (a) f^-1(x) = (1/3)(x + 7), (b) f^-1(x) = (x - 2)^(1/3)

5.49. F(x, y, z) = A(x, y, z)^T, where

(a) A = [ 1   2  -3 ]        (b) A = [ a  b ]
        [ 4  -5   6 ]                [ c  d ]

5.50. (a) u = (2, 2), k = 3; then F(ku) = (36, 36) but kF(u) = (12, 12); (b) F(0) ≠ 0;

(c) u = (1, 2), v = (3, 4); then F(u + v) = (24, 6) but F(u) + F(v) = (14, 6);

(d) u = (1, 2, 3), k = -2; then F(ku) = (2, -10) but kF(u) = (-2, -10).

5.51. F(a, b) = (-a + 2b, -3a + b)

5.52. (a) A = [ -17   5 ]
              [  23  -6 ]

(b) None. (2, -4) and (-1, 2) are linearly dependent but not (1, 1) and (1, 3).

5.53. B = [ 1  0 ]
          [ 3  0 ]    [Hint: Send (0, 1)^T into (0, 0)^T.]

5.55. F(x, y) = (x^2, y^2)

5.56. (a) 13x^2 - 42xy + 34y^2 = 1, (b) 13x^2 + 42xy + 34y^2 = 1

5.57. (a) x^2 - 8xy + 26y^2 + 6xz - 38yz + 14z^2 = 1, (b) x^2 + 2xy + 3y^2 + 2xz - 8yz + 14z^2 = 1

5.58. (a) x - y + 2z = 4, (b) x + 6z = 4

5.61. (a) dim(Ker F) = 1, {(7, -2, 1)}; dim(Im F) = 2, {(1, 2, 1), (0, 1, 2)};

(b) dim(Ker F) = 2, {(-2, 1, 0, 0), (1, 0, -1, 1)}; dim(Im F) = 2, {(1, 2, 1), (0, 1, 3)}

5.62. (a) dim(Ker G) = 2, {(1, 0, -1), (1, -1, 0)}; dim(Im G) = 1, {(1, 2)};

(b) dim(Ker G) = 1, {(1, -1, 1)}; Im G = R^2, {(1, 0), (0, 1)};

(c) dim(Ker G) = 3, {(-2, 1, 0, 0, 0), (1, 0, -1, 1, 0), (-5, 0, 2, 0, 1)}; dim(Im G) = 2,

{(1, 1, 3), (0, 1, 2)}

5.63. (a) dim(Ker A) = 2, {(4, -2, -5, 0), (1, -3, 0, 5)}; dim(Im A) = 2, {(1, 2, 1), (0, 1, 1)};

(b) dim(Ker B) = 1, {(-1, 2/3, 1, 1)}; Im B = R^3

5.64. F(x, y, z) = (x + 4y, 2x + 5y, 3x + 6y)
5.65. F(x, y, z, t) = (x + y - z, 2x + y - t, 0)

5.66. (a) {1, t, t^2, ..., t^6}, (b) {1, t, t^2, t^3}

5.68. None, because dim R^4 > dim R^3.

5.69. Ker 0 = V, Im 0 = {0}

5.70. (F + G)(x, y, z) = (y + 2z, 2x - y + z), (3F - 2G)(x, y, z) = (3y - 4z, x + 2y + 3z)

5.71. (a) (H∘F)(x, y, z) = (x + z, 2y), (H∘G)(x, y, z) = (x - y, 4z); (b) not defined;

(c) (H∘(F + G))(x, y, z) = (H∘F + H∘G)(x, y, z) = (2x - y + z, 2y + 4z)

5.74. F(x, y) = (x, y, y), G(x, y, z) = (x, y)

5.75. (a) 16, (b) 15, (c) 24

5.76. (a) v = (2, -3, 1); (b) G^-1(at^2 + bt + c) = (b - 2c, a - b + 2c, -a + b - c);

(c) H is nonsingular, but not invertible, because dim P_2(t) > dim R^2.

5.77. dim U = 1; that is, U = K.

5.78. (a) (F + G)(x, y) = (x, x); (b) (5F - 3G)(x, y) = (5x + 8y, -3x); (c) (FG)(x, y) = (x - y, 0);

(d) (GF)(x, y) = (0, x + y); (e) F^2(x, y) = (x + y, 0) (note that F^2 = F); (f) G^2(x, y) = (-x, -y).

[Note that G^2 + I = 0; hence, G is a zero of f(t) = t^2 + 1.]

5.79. (a) T^-1(x, y) = (-3x + 2y, 2x - y), (b) T^-1(x, y) = (-4x + 3y, -3x + 2y)

5.80. (a) T^-1(x, y, z) = (x + 3y + 14z, y + 4z, z), (b) T^-1(x, y, z) = (y + z, z, x - y - z)

5.81. (a) 49, (b) 36, (c) 144

5.82. Squares: 9, 25, 36, 64, 100

5.83. (a) f(T)(x, y) = (6x + 14y, 21x + 27y); (b) f(T)(x, y) = (0, 0); that is, f(T) = 0




Linear Mappings 
and Matrices 


6.1 Introduction 


Consider a basis S = {u_1, u_2, ..., u_n} of a vector space V over a field K. For any vector v ∈ V, suppose
v = a_1 u_1 + a_2 u_2 + ... + a_n u_n

Then the coordinate vector of v relative to the basis S, which we assume to be a column vector (unless
otherwise stated or implied), is denoted and defined by

[v]_S = [a_1, a_2, ..., a_n]^T

Recall (Section 4.11) that the mapping v ↦ [v]_S, determined by the basis S, is an isomorphism between V
and K^n.

This chapter shows that there is also an isomorphism, determined by the basis S, between the algebra
A(V) of linear operators on V and the algebra M of n-square matrices over K. Thus, every linear mapping
F: V → V will correspond to an n-square matrix [F]_S determined by the basis S. We will also show how
our matrix representation changes when we choose another basis.


6.2 Matrix Representation of a Linear Operator 


Let T be a linear operator (transformation) from a vector space V into itself, and suppose
S = {u_1, u_2, ..., u_n} is a basis of V. Now T(u_1), T(u_2), ..., T(u_n) are vectors in V, and so each is a
linear combination of the vectors in the basis S; say,

T(u_1) = a_11 u_1 + a_12 u_2 + ... + a_1n u_n
T(u_2) = a_21 u_1 + a_22 u_2 + ... + a_2n u_n
..........................................
T(u_n) = a_n1 u_1 + a_n2 u_2 + ... + a_nn u_n

The following definition applies.

DEFINITION: The transpose of the above matrix of coefficients, denoted by m_S(T) or [T]_S, is called
the matrix representation of T relative to the basis S, or simply the matrix of T in the
basis S. (The subscript S may be omitted if the basis S is understood.)

Using the coordinate (column) vector notation, the matrix representation of T may be written in the
form

m_S(T) = [T]_S = [[T(u_1)]_S, [T(u_2)]_S, ..., [T(u_n)]_S]

That is, the columns of m(T) are the coordinate vectors of T(u_1), T(u_2), ..., T(u_n), respectively.




EXAMPLE 6.1 Let F: R^2 → R^2 be the linear operator defined by F(x, y) = (2x + 3y, 4x - 5y).

(a) Find the matrix representation of F relative to the basis S = {u_1, u_2} = {(1, 2), (2, 5)}.

(1) First find F(u_1), and then write it as a linear combination of the basis vectors u_1 and u_2. (For notational
convenience, we use column vectors.) We have

F(u_1) = F([1, 2]^T) = [8, -6]^T = x[1, 2]^T + y[2, 5]^T    and    x + 2y = 8, 2x + 5y = -6

Solve the system to obtain x = 52, y = -22. Hence, F(u_1) = 52u_1 - 22u_2.

(2) Next find F(u_2), and then write it as a linear combination of u_1 and u_2:

F(u_2) = F([2, 5]^T) = [19, -17]^T = x[1, 2]^T + y[2, 5]^T    and    x + 2y = 19, 2x + 5y = -17

Solve the system to get x = 129, y = -55. Thus, F(u_2) = 129u_1 - 55u_2.

Now write the coordinates of F(u_1) and F(u_2) as columns to obtain the matrix

[F]_S = [  52  129 ]
        [ -22  -55 ]

(b) Find the matrix representation of F relative to the (usual) basis E = {e_1, e_2} = {(1, 0), (0, 1)}.

Find F(e_1) and write it as a linear combination of the usual basis vectors e_1 and e_2, and then find F(e_2) and
write it as a linear combination of e_1 and e_2. We have

F(e_1) = F(1, 0) = (2, 4) = 2e_1 + 4e_2
F(e_2) = F(0, 1) = (3, -5) = 3e_1 - 5e_2

and so

[F]_E = [ 2   3 ]
        [ 4  -5 ]

Note that the coordinates of F(e_1) and F(e_2) form the columns, not the rows, of [F]_E. Also, note that the
arithmetic is much simpler using the usual basis of R^2.
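
Both parts of Example 6.1 follow the same recipe: apply F to each basis vector, solve a 2 x 2 system for its coordinates, and place those coordinates in a column. The sketch below covers the two-dimensional case only. It solves each system by Cramer's rule, and the helper names coords and matrix_rep are assumptions introduced here, not notation from the text.

```python
from fractions import Fraction

def coords(v, u1, u2):
    """Coordinates (x, y) with v = x*u1 + y*u2, found by Cramer's rule."""
    det = u1[0] * u2[1] - u2[0] * u1[1]
    x = Fraction(v[0] * u2[1] - u2[0] * v[1], det)
    y = Fraction(u1[0] * v[1] - v[0] * u1[1], det)
    return (x, y)

def matrix_rep(T, u1, u2):
    """Matrix of T in the basis {u1, u2}; column k holds the coordinates of T(u_k)."""
    c1 = coords(T(u1), u1, u2)
    c2 = coords(T(u2), u1, u2)
    return [[c1[0], c2[0]],
            [c1[1], c2[1]]]

def F(v):
    x, y = v
    return (2 * x + 3 * y, 4 * x - 5 * y)

F_S = matrix_rep(F, (1, 2), (2, 5))   # basis S of part (a)
F_E = matrix_rep(F, (1, 0), (0, 1))   # usual basis E of part (b)
print(F_S, F_E)
```

The coordinates are stored column by column, which matches the remark that the coordinates of F(e_1) and F(e_2) form the columns, not the rows, of the matrix.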


EXAMPLE 6.2 Let V be the vector space of functions with basis S = {sin t, cos t, e^(3t)}, and let D: V → V be
the differential operator defined by D(f(t)) = d(f(t))/dt. We compute the matrix representing D in the
basis S:

D(sin t) = cos t = 0(sin t) + 1(cos t) + 0(e^(3t))
D(cos t) = -sin t = -1(sin t) + 0(cos t) + 0(e^(3t))
D(e^(3t)) = 3e^(3t) = 0(sin t) + 0(cos t) + 3(e^(3t))

and so

[D] = [ 0  -1   0 ]
      [ 1   0   0 ]
      [ 0   0   3 ]

Note that the coordinates of D(sin t), D(cos t), D(e^(3t)) form the columns, not the rows, of [D].


Matrix Mappings and Their Matrix Representation 

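Consider the following matrix $A$, which may be viewed as a linear operator on $\mathbf{R}^2$, and basis $S$ of $\mathbf{R}^2$:

\[ A = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix} \quad\text{and}\quad S = \{u_1, u_2\} = \left\{ \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ 5 \end{bmatrix} \right\} \]

(We write vectors as columns, because our map is a matrix.) We find the matrix representation of $A$ relative to the basis $S$.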
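(1) First we write $A(u_1)$ as a linear combination of $u_1$ and $u_2$. We have

\[ A(u_1) = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ -6 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \quad\text{and so}\quad \begin{aligned} x + 2y &= -1 \\ 2x + 5y &= -6 \end{aligned} \]

Solving the system yields $x = 7$, $y = -4$. Thus, $A(u_1) = 7u_1 - 4u_2$.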
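(2) Next we write $A(u_2)$ as a linear combination of $u_1$ and $u_2$. We have

\[ A(u_2) = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix} \begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} -4 \\ -17 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \quad\text{and so}\quad \begin{aligned} x + 2y &= -4 \\ 2x + 5y &= -17 \end{aligned} \]

Solving the system yields $x = 14$, $y = -9$. Thus, $A(u_2) = 14u_1 - 9u_2$. Writing the coordinates of $A(u_1)$ and $A(u_2)$ as columns gives us the following matrix representation of $A$:

\[ [A]_S = \begin{bmatrix} 7 & 14 \\ -4 & -9 \end{bmatrix} \]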

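Remark: Suppose we want to find the matrix representation of $A$ relative to the usual basis $E = \{e_1, e_2\} = \{[1,0]^T, [0,1]^T\}$ of $\mathbf{R}^2$. We have

\[ \begin{aligned} A(e_1) &= \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} = 3e_1 + 4e_2 \\ A(e_2) &= \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -2 \\ -5 \end{bmatrix} = -2e_1 - 5e_2 \end{aligned} \quad\text{and so}\quad [A]_E = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix} \]

Note that $[A]_E$ is the original matrix $A$. This result is true in general:

The matrix representation of any $n \times n$ square matrix $A$ over a field $K$ relative to the usual basis $E$ of $K^n$ is the matrix $A$ itself; that is,

\[ [A]_E = A \]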


Algorithm for Finding Matrix Representations 

Next follows an algorithm for finding matrix representations. The first step, Step 0, is optional. Its result may be useful in Step 1(b), which is repeated for each basis vector.


ALGORITHM 6.1: The input is a linear operator $T$ on a vector space $V$ and a basis $S = \{u_1, u_2, \ldots, u_n\}$ of $V$. The output is the matrix representation $[T]_S$.

Step 0. Find a formula for the coordinates of an arbitrary vector $v$ relative to the basis $S$.

Step 1. Repeat for each basis vector $u_k$ in $S$:

(a) Find $T(u_k)$.

(b) Write $T(u_k)$ as a linear combination of the basis vectors $u_1, u_2, \ldots, u_n$.

Step 2. Form the matrix $[T]_S$ whose columns are the coordinate vectors in Step 1(b).
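
The steps of Algorithm 6.1 can be sketched in code for operators on $\mathbf{R}^n$. The following is a minimal illustration, not part of the original text: the helper names `solve` and `matrix_rep` are hypothetical, Step 0 is replaced by solving a linear system for each image vector, and exact rational arithmetic is assumed so that the output matches hand computation.

```python
from fractions import Fraction

def solve(basis, target):
    """Coordinates of `target` relative to `basis`, by Gauss-Jordan elimination."""
    n = len(basis)
    # Augmented matrix: column j holds basis[j]; the last column holds target.
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(target[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]

def matrix_rep(T, basis):
    """Algorithm 6.1: column k holds the coordinates of T(u_k) relative to basis."""
    n = len(basis)
    cols = [solve(basis, T(u)) for u in basis]                   # Step 1
    return [[cols[k][i] for k in range(n)] for i in range(n)]    # Step 2

# The operator F(x, y) = (2x + 3y, 4x - 5y) used in this section.
F = lambda v: (2 * v[0] + 3 * v[1], 4 * v[0] - 5 * v[1])
S = [(1, -2), (2, -5)]
F_S = matrix_rep(F, S)
print(F_S)
```

With the basis $\{(1,2), (2,5)\}$ in place of `S`, the same function should reproduce the matrix found in Example 6.1(a).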

EXAMPLE 6.3 Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (2x + 3y, 4x - 5y)$. Find the matrix representation $[F]_S$ of $F$ relative to the basis $S = \{u_1, u_2\} = \{(1,-2), (2,-5)\}$.

(Step 0) First find the coordinates of $(a, b) \in \mathbf{R}^2$ relative to the basis $S$. We have

\[ \begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ -2 \end{bmatrix} + y\begin{bmatrix} 2 \\ -5 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 2y &= a \\ -2x - 5y &= b \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y &= a \\ -y &= 2a + b \end{aligned} \]

Solving for $x$ and $y$ in terms of $a$ and $b$ yields $x = 5a + 2b$, $y = -2a - b$. Thus,

\[ (a, b) = (5a + 2b)u_1 + (-2a - b)u_2 \]

(Step 1) Now we find $F(u_1)$ and write it as a linear combination of $u_1$ and $u_2$ using the above formula for $(a, b)$, and then we repeat the process for $F(u_2)$. We have

\[ \begin{aligned} F(u_1) &= F(1,-2) = (-4, 14) = 8u_1 - 6u_2 \\ F(u_2) &= F(2,-5) = (-11, 33) = 11u_1 - 11u_2 \end{aligned} \]

(Step 2) Finally, we write the coordinates of $F(u_1)$ and $F(u_2)$ as columns to obtain the required matrix:

\[ [F]_S = \begin{bmatrix} 8 & 11 \\ -6 & -11 \end{bmatrix} \]


Properties of Matrix Representations 

This subsection gives the main properties of the matrix representations of linear operators T on a vector 
space V. We emphasize that we are always given a particular basis S of V. 

Our first theorem, proved in Problem 6.9, tells us that the “action” of a linear operator T on a vector v is 
preserved by its matrix representation. 

THEOREM 6.1: Let $T: V \to V$ be a linear operator, and let $S$ be a (finite) basis of $V$. Then, for any vector $v$ in $V$, $[T]_S [v]_S = [T(v)]_S$.

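EXAMPLE 6.4 Consider the linear operator $F$ on $\mathbf{R}^2$ and the basis $S$ of Example 6.3; that is,

\[ F(x, y) = (2x + 3y, 4x - 5y) \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,-2), (2,-5)\} \]

Let

\[ v = (5, -7), \quad\text{and so}\quad F(v) = (-11, 55) \]

Using the formula from Example 6.3, we get

\[ [v] = [11, -3]^T \quad\text{and}\quad [F(v)] = [55, -33]^T \]

We verify Theorem 6.1 for this vector $v$ (where $[F]$ is obtained from Example 6.3):

\[ [F][v] = \begin{bmatrix} 8 & 11 \\ -6 & -11 \end{bmatrix} \begin{bmatrix} 11 \\ -3 \end{bmatrix} = \begin{bmatrix} 55 \\ -33 \end{bmatrix} = [F(v)] \]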
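
The check in Example 6.4 can also be carried out mechanically. The sketch below is only an illustration: the function name `coords` is hypothetical and simply encodes the Step 0 formula $(a, b) = (5a + 2b)u_1 + (-2a - b)u_2$ from Example 6.3.

```python
def coords(a, b):
    """Coordinates of (a, b) relative to S = {(1,-2), (2,-5)} (Example 6.3, Step 0)."""
    return [5 * a + 2 * b, -2 * a - b]

def F(x, y):
    """The operator F(x, y) = (2x + 3y, 4x - 5y)."""
    return (2 * x + 3 * y, 4 * x - 5 * y)

F_S = [[8, 11], [-6, -11]]      # [F]_S from Example 6.3

v = (5, -7)
v_S = coords(*v)                # [v]_S
Fv_S = coords(*F(*v))           # [F(v)]_S, computed directly from F(v)
# Matrix-vector product [F]_S [v]_S
product = [sum(a * b for a, b in zip(row, v_S)) for row in F_S]

print(v_S, Fv_S, product)
```

Theorem 6.1 predicts that `product` and `Fv_S` agree for every choice of `v`, not only for the vector used here.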


Given a basis $S$ of a vector space $V$, we have associated a matrix $[T]$ to each linear operator $T$ in the
algebra A(V) of linear operators on V. Theorem 6.1 tells us that the “action” of an individual linear 
operator T is preserved by this representation. The next two theorems (proved in Problems 6.10 and 6.11) 
tell us that the three basic operations in A(V) with these operators—namely (i) addition, (ii) scalar 
multiplication, and (iii) composition—are also preserved. 


THEOREM 6.2: Let $V$ be an $n$-dimensional vector space over $K$, let $S$ be a basis of $V$, and let $\mathbf{M}$ be the algebra of $n \times n$ matrices over $K$. Then the mapping

\[ m: A(V) \to \mathbf{M} \quad\text{defined by}\quad m(T) = [T]_S \]

is a vector space isomorphism. That is, for any $F, G \in A(V)$ and any $k \in K$,

(i) $m(F + G) = m(F) + m(G)$ or $[F + G] = [F] + [G]$

(ii) $m(kF) = km(F)$ or $[kF] = k[F]$

(iii) $m$ is bijective (one-to-one and onto).

THEOREM 6.3: For any linear operators $F, G \in A(V)$,

\[ m(G \circ F) = m(G)m(F) \quad\text{or}\quad [G \circ F] = [G][F] \]

(Here $G \circ F$ denotes the composition of the maps $G$ and $F$.)
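
As an illustration of Theorem 6.3, the following sketch compares the matrix of a composition with the product of the individual matrices, relative to the usual basis of $\mathbf{R}^2$. The helper names `rep_usual` and `matmul` are hypothetical; the operator `G` is the one that appears later in Problem 6.2, chosen here only as a convenient second operator.

```python
def rep_usual(T, n):
    """Matrix of T relative to the usual basis: column k is T(e_k)."""
    cols = [T(tuple(1 if i == k else 0 for i in range(n))) for k in range(n)]
    return [[cols[k][i] for k in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

F = lambda v: (2 * v[0] + 3 * v[1], 4 * v[0] - 5 * v[1])
G = lambda v: (2 * v[0] - 7 * v[1], 4 * v[0] + 3 * v[1])
GF = lambda v: G(F(v))          # the composition G o F

F_E = rep_usual(F, 2)
G_E = rep_usual(G, 2)
GF_E = rep_usual(GF, 2)

print(GF_E, matmul(G_E, F_E))
```

Note the order of the factors: the matrix of $G \circ F$ is $[G][F]$, so the operator applied first appears on the right.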

6.3 Change of Basis 


Let $V$ be an $n$-dimensional vector space over a field $K$. We have shown that once we have selected a basis
$S$ of $V$, every vector $v \in V$ can be represented by means of an $n$-tuple $[v]_S$ in $K^n$, and every linear operator
T in A(V) can be represented by an n x n matrix over K. We ask the following natural question: 

How do our representations change if we select another basis? 


In order to answer this question, we first need a definition. 

DEFINITION: Let $S = \{u_1, u_2, \ldots, u_n\}$ be a basis of a vector space $V$, and let $S' = \{v_1, v_2, \ldots, v_n\}$ be another basis. (For reference, we will call $S$ the “old” basis and $S'$ the “new” basis.) Because $S$ is a basis, each vector in the “new” basis $S'$ can be written uniquely as a linear combination of the vectors in $S$; say,

\[ \begin{aligned} v_1 &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\ v_2 &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\ &\;\;\vdots \\ v_n &= a_{n1}u_1 + a_{n2}u_2 + \cdots + a_{nn}u_n \end{aligned} \]

Let $P$ be the transpose of the above matrix of coefficients; that is, let $P = [p_{ij}]$, where $p_{ij} = a_{ji}$. Then $P$ is called the change-of-basis matrix (or transition matrix) from the “old” basis $S$ to the “new” basis $S'$.

The following remarks are in order. 

Remark 1: The above change-of-basis matrix $P$ may also be viewed as the matrix whose columns are, respectively, the coordinate column vectors of the “new” basis vectors $v_i$ relative to the “old” basis $S$: namely,

\[ P = \big[ [v_1]_S, [v_2]_S, \ldots, [v_n]_S \big] \]

Remark 2: Analogously, there is a change-of-basis matrix $Q$ from the “new” basis $S'$ to the “old” basis $S$. Similarly, $Q$ may be viewed as the matrix whose columns are, respectively, the coordinate column vectors of the “old” basis vectors $u_i$ relative to the “new” basis $S'$: namely,

\[ Q = \big[ [u_1]_{S'}, [u_2]_{S'}, \ldots, [u_n]_{S'} \big] \]

Remark 3: Because the vectors $v_1, v_2, \ldots, v_n$ in the new basis $S'$ are linearly independent, the matrix $P$ is invertible (Problem 6.18). Similarly, $Q$ is invertible. In fact, we have the following proposition (proved in Problem 6.18).

PROPOSITION 6.4: Let $P$ and $Q$ be the above change-of-basis matrices. Then $Q = P^{-1}$.

Now suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of a vector space $V$, and suppose $P = [p_{ij}]$ is any nonsingular matrix. Then the $n$ vectors

\[ v_i = p_{1i}u_1 + p_{2i}u_2 + \cdots + p_{ni}u_n, \qquad i = 1, 2, \ldots, n \]

corresponding to the columns of $P$, are linearly independent [Problem 6.21(a)]. Thus, they form another basis $S'$ of $V$. Moreover, $P$ will be the change-of-basis matrix from $S$ to the new basis $S'$.
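EXAMPLE 6.5 Consider the following two bases of $\mathbf{R}^2$:

\[ S = \{u_1, u_2\} = \{(1,2), (3,5)\} \quad\text{and}\quad S' = \{v_1, v_2\} = \{(1,-1), (1,-2)\} \]

(a) Find the change-of-basis matrix $P$ from $S$ to the “new” basis $S'$.

Write each of the new basis vectors of $S'$ as a linear combination of the original basis vectors $u_1$ and $u_2$ of $S$. We have

\[ \begin{bmatrix} 1 \\ -1 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 3 \\ 5 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 3y &= 1 \\ 2x + 5y &= -1 \end{aligned} \quad\text{yielding}\quad x = -8,\ y = 3 \]

\[ \begin{bmatrix} 1 \\ -2 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 3 \\ 5 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 3y &= 1 \\ 2x + 5y &= -2 \end{aligned} \quad\text{yielding}\quad x = -11,\ y = 4 \]

Thus,

\[ \begin{aligned} v_1 &= -8u_1 + 3u_2 \\ v_2 &= -11u_1 + 4u_2 \end{aligned} \quad\text{and hence,}\quad P = \begin{bmatrix} -8 & -11 \\ 3 & 4 \end{bmatrix} \]

Note that the coordinates of $v_1$ and $v_2$ are the columns, not rows, of the change-of-basis matrix $P$.

(b) Find the change-of-basis matrix $Q$ from the “new” basis $S'$ back to the “old” basis $S$.

Here we write each of the “old” basis vectors $u_1$ and $u_2$ of $S$ as a linear combination of the “new” basis vectors $v_1$ and $v_2$ of $S'$. This yields

\[ \begin{aligned} u_1 &= 4v_1 - 3v_2 \\ u_2 &= 11v_1 - 8v_2 \end{aligned} \quad\text{and hence,}\quad Q = \begin{bmatrix} 4 & 11 \\ -3 & -8 \end{bmatrix} \]

As expected from Proposition 6.4, $Q = P^{-1}$. (In fact, we could have obtained $Q$ by simply finding $P^{-1}$.)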
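
The computation in Example 6.5 can be reproduced with a short sketch. This is an illustration under stated assumptions: the helper `coords2` is a hypothetical name, and it solves each $2 \times 2$ system by Cramer's rule rather than by the elimination used in the text.

```python
from fractions import Fraction

def coords2(u1, u2, w):
    """Coordinates (x, y) with x*u1 + y*u2 = w, by Cramer's rule (2 x 2 only)."""
    det = u1[0] * u2[1] - u2[0] * u1[1]
    x = Fraction(w[0] * u2[1] - u2[0] * w[1], det)
    y = Fraction(u1[0] * w[1] - w[0] * u1[1], det)
    return [x, y]

def from_columns(cols):
    """Assemble a matrix (list of rows) from a list of column vectors."""
    return [[c[i] for c in cols] for i in range(len(cols[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

u1, u2 = (1, 2), (3, 5)       # "old" basis S
v1, v2 = (1, -1), (1, -2)     # "new" basis S'

# Columns of P: coordinates of the new basis vectors relative to S.
P = from_columns([coords2(u1, u2, v1), coords2(u1, u2, v2)])
# Columns of Q: coordinates of the old basis vectors relative to S'.
Q = from_columns([coords2(v1, v2, u1), coords2(v1, v2, u2)])

print(P, Q, matmul(P, Q))
```

The product `matmul(P, Q)` should be the identity matrix, in agreement with Proposition 6.4.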


EXAMPLE 6.6 Consider the following two bases of $\mathbf{R}^3$:

\[ E = \{e_1, e_2, e_3\} = \{(1,0,0), (0,1,0), (0,0,1)\} \quad\text{and}\quad S = \{u_1, u_2, u_3\} = \{(1,0,1), (2,1,2), (1,2,2)\} \]

(a) Find the change-of-basis matrix $P$ from the basis $E$ to the basis $S$.

Because $E$ is the usual basis, we can immediately write each basis element of $S$ as a linear combination of the basis elements of $E$. Specifically,

\[ \begin{aligned} u_1 &= (1,0,1) = e_1 + e_3 \\ u_2 &= (2,1,2) = 2e_1 + e_2 + 2e_3 \\ u_3 &= (1,2,2) = e_1 + 2e_2 + 2e_3 \end{aligned} \quad\text{and hence,}\quad P = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix} \]

Again, the coordinates of $u_1, u_2, u_3$ appear as the columns in $P$. Observe that $P$ is simply the matrix whose columns are the basis vectors of $S$. This is true only because the original basis was the usual basis $E$.

(b) Find the change-of-basis matrix $Q$ from the basis $S$ to the basis $E$.

The definition of the change-of-basis matrix $Q$ tells us to write each of the (usual) basis vectors in $E$ as a linear combination of the basis elements of $S$. This yields

\[ \begin{aligned} e_1 &= (1,0,0) = -2u_1 + 2u_2 - u_3 \\ e_2 &= (0,1,0) = -2u_1 + u_2 \\ e_3 &= (0,0,1) = 3u_1 - 2u_2 + u_3 \end{aligned} \quad\text{and hence,}\quad Q = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \]

We emphasize that to find $Q$, we need to solve three $3 \times 3$ systems of linear equations, one $3 \times 3$ system for each of $e_1, e_2, e_3$.
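Alternatively, we can find $Q = P^{-1}$ by forming the matrix $M = [P, I]$ and row reducing $M$ to row canonical form:

\[ M = \left[ \begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & 1 & 2 & 0 & 1 & 0 \\ 1 & 2 & 2 & 0 & 0 & 1 \end{array} \right] \sim \left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & -2 & -2 & 3 \\ 0 & 1 & 0 & 2 & 1 & -2 \\ 0 & 0 & 1 & -1 & 0 & 1 \end{array} \right] = [I, P^{-1}] \]

thus,

\[ Q = P^{-1} = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \]

(Here we have used the fact that $Q$ is the inverse of $P$.)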
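
The row reduction of $M = [P, I]$ can be sketched as follows. This is an illustrative Gauss-Jordan routine, not the book's own procedure: the function name `inverse` is hypothetical, exact fractions are used to avoid rounding, and the input is assumed to be invertible (a singular matrix would make the pivot search fail).

```python
from fractions import Fraction

def inverse(P):
    """Invert P by row reducing the augmented matrix [P, I] to [I, P^-1]."""
    n = len(P)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(P)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]          # make the pivot equal to 1
        for r in range(n):
            if r != col:                          # clear the rest of the column
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]                 # right half is the inverse

P = [[1, 2, 1], [0, 1, 2], [1, 2, 2]]
Q = inverse(P)
print(Q)
```

This single reduction replaces the three separate $3 \times 3$ systems solved in part (b).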

The result in Example 6.6(a) is true in general. We state this result formally, because it occurs often. 

PROPOSITION 6.5: The change-of-basis matrix from the usual basis $E$ of $K^n$ to any basis $S$ of $K^n$ is the matrix $P$ whose columns are, respectively, the basis vectors of $S$.


Applications of Change-of-Basis Matrix 

First we show how a change of basis affects the coordinates of a vector in a vector space V. The following 
theorem is proved in Problem 6.22. 


THEOREM 6.6: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any vector $v \in V$, we have

\[ P[v]_{S'} = [v]_S \quad\text{and hence,}\quad P^{-1}[v]_S = [v]_{S'} \]

Namely, if we multiply the coordinates of $v$ in the original basis $S$ by $P^{-1}$, we get the coordinates of $v$ in the new basis $S'$.


Remark 1: Although $P$ is called the change-of-basis matrix from the old basis $S$ to the new basis $S'$, we emphasize that $P^{-1}$ transforms the coordinates of $v$ in the original basis $S$ into the coordinates of $v$ in the new basis $S'$.


Remark 2: Because of the above theorem, many texts call $Q = P^{-1}$, not $P$, the transition matrix from the old basis $S$ to the new basis $S'$. Some texts also refer to $Q$ as the change-of-coordinates matrix.


We now give the proof of the above theorem for the special case that $\dim V = 3$. Suppose $P$ is the change-of-basis matrix from the basis $S = \{u_1, u_2, u_3\}$ to the basis $S' = \{v_1, v_2, v_3\}$; say,

\[ \begin{aligned} v_1 &= a_1u_1 + a_2u_2 + a_3u_3 \\ v_2 &= b_1u_1 + b_2u_2 + b_3u_3 \\ v_3 &= c_1u_1 + c_2u_2 + c_3u_3 \end{aligned} \quad\text{and hence,}\quad P = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \]

Now suppose $v \in V$ and, say, $v = k_1v_1 + k_2v_2 + k_3v_3$. Then, substituting for $v_1, v_2, v_3$ from above, we obtain

\[ \begin{aligned} v &= k_1(a_1u_1 + a_2u_2 + a_3u_3) + k_2(b_1u_1 + b_2u_2 + b_3u_3) + k_3(c_1u_1 + c_2u_2 + c_3u_3) \\ &= (a_1k_1 + b_1k_2 + c_1k_3)u_1 + (a_2k_1 + b_2k_2 + c_2k_3)u_2 + (a_3k_1 + b_3k_2 + c_3k_3)u_3 \end{aligned} \]

Thus,

\[ [v]_{S'} = \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix} \quad\text{and}\quad [v]_S = \begin{bmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{bmatrix} \]

Accordingly,

\[ P[v]_{S'} = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix} = \begin{bmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{bmatrix} = [v]_S \]

Finally, multiplying the equation $[v]_S = P[v]_{S'}$ by $P^{-1}$, we get

\[ P^{-1}[v]_S = P^{-1}P[v]_{S'} = I[v]_{S'} = [v]_{S'} \]

The next theorem (proved in Problem 6.26) shows how a change of basis affects the matrix 
representation of a linear operator. 

THEOREM 6.7: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any linear operator $T$ on $V$,

\[ [T]_{S'} = P^{-1}[T]_S P \]

That is, if $A$ and $B$ are the matrix representations of $T$ relative, respectively, to $S$ and $S'$, then

\[ B = P^{-1}AP \]

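EXAMPLE 6.7 Consider the following two bases of $\mathbf{R}^3$:

\[ E = \{e_1, e_2, e_3\} = \{(1,0,0), (0,1,0), (0,0,1)\} \quad\text{and}\quad S = \{u_1, u_2, u_3\} = \{(1,0,1), (2,1,2), (1,2,2)\} \]

The change-of-basis matrix $P$ from $E$ to $S$ and its inverse $P^{-1}$ were obtained in Example 6.6.

(a) Write $v = (1,3,5)$ as a linear combination of $u_1, u_2, u_3$, or, equivalently, find $[v]_S$.

One way to do this is to directly solve the vector equation $v = xu_1 + yu_2 + zu_3$; that is,

\[ \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = x\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + y\begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} + z\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 1 \\ y + 2z &= 3 \\ x + 2y + 2z &= 5 \end{aligned} \]

The solution is $x = 7$, $y = -5$, $z = 4$, so $v = 7u_1 - 5u_2 + 4u_3$.

On the other hand, we know that $[v]_E = [1, 3, 5]^T$, because $E$ is the usual basis, and we already know $P^{-1}$. Therefore, by Theorem 6.6,

\[ [v]_S = P^{-1}[v]_E = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = \begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix} \]

Thus, again, $v = 7u_1 - 5u_2 + 4u_3$.

(b) Let $A = \begin{bmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & -1 & 2 \end{bmatrix}$, which may be viewed as a linear operator on $\mathbf{R}^3$. Find the matrix $B$ that represents $A$ relative to the basis $S$.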
The definition of the matrix representation of $A$ relative to the basis $S$ tells us to write each of $A(u_1)$, $A(u_2)$, $A(u_3)$ as a linear combination of the basis vectors $u_1, u_2, u_3$ of $S$. This yields

\[ \begin{aligned} A(u_1) &= (-1, 3, 5) = 11u_1 - 9u_2 + 6u_3 \\ A(u_2) &= (1, 2, 9) = 21u_1 - 14u_2 + 8u_3 \\ A(u_3) &= (3, -4, 5) = 17u_1 - 8u_2 + 2u_3 \end{aligned} \quad\text{and hence,}\quad B = \begin{bmatrix} 11 & 21 & 17 \\ -9 & -14 & -8 \\ 6 & 8 & 2 \end{bmatrix} \]

We emphasize that to find $B$, we need to solve three $3 \times 3$ systems of linear equations, one $3 \times 3$ system for each of $A(u_1)$, $A(u_2)$, $A(u_3)$.

On the other hand, because we know $P$ and $P^{-1}$, we can use Theorem 6.7. That is,

\[ B = P^{-1}AP = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & -1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix} = \begin{bmatrix} 11 & 21 & 17 \\ -9 & -14 & -8 \\ 6 & 8 & 2 \end{bmatrix} \]


This, as expected, gives the same result. 
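
Both parts of Example 6.7 reduce to matrix products once $P$ and $P^{-1}$ are known. The following sketch, with a hypothetical helper `matmul`, applies Theorem 6.6 to the vector $v$ and Theorem 6.7 to the matrix $A$; the matrices are copied from Examples 6.6 and 6.7.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P     = [[1, 2, 1], [0, 1, 2], [1, 2, 2]]         # change of basis from E to S
P_inv = [[-2, -2, 3], [2, 1, -2], [-1, 0, 1]]     # its inverse (Example 6.6)
A     = [[1, 3, -2], [2, -4, 1], [3, -1, 2]]

# Theorem 6.6: [v]_S = P^-1 [v]_E, with v written as a column.
v_E = [[1], [3], [5]]
v_S = matmul(P_inv, v_E)

# Theorem 6.7: B = P^-1 A P represents the same operator relative to S.
B = matmul(matmul(P_inv, A), P)

print(v_S, B)
```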


6.4 Similarity 


Suppose $A$ and $B$ are square matrices for which there exists an invertible matrix $P$ such that $B = P^{-1}AP$;
then B is said to be similar to A, or B is said to be obtained from A by a similarity transformation. We show 
(Problem 6.29) that similarity of matrices is an equivalence relation. 

By Theorem 6.7 and the above remark, we have the following basic result. 

THEOREM 6.8: Two matrices represent the same linear operator if and only if the matrices are similar. 

That is, all the matrix representations of a linear operator T form an equivalence class of similar 
matrices. 

A linear operator T is said to be diagonalizable if there exists a basis S of V such that T is represented 
by a diagonal matrix; the basis S is then said to diagonalize T. The preceding theorem gives us the 
following result. 

THEOREM 6.9: Let $A$ be the matrix representation of a linear operator $T$. Then $T$ is diagonalizable if and only if there exists an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.

That is, T is diagonalizable if and only if its matrix representation can be diagonalized by a similarity 
transformation. 

We emphasize that not every operator is diagonalizable. However, we will show (Chapter 10) that every 
linear operator can be represented by certain “standard” matrices called its normal or canonical forms. 
Such a discussion will require some theory of fields, polynomials, and determinants. 

Functions and Similar Matrices 

Suppose $f$ is a function on square matrices that assigns the same value to similar matrices; that is, $f(A) = f(B)$ whenever $A$ is similar to $B$. Then $f$ induces a function, also denoted by $f$, on linear operators $T$ in the following natural way. We define

\[ f(T) = f([T]_S) \]

where S is any basis. By Theorem 6.8, the function is well defined. 

The determinant (Chapter 8) is perhaps the most important example of such a function. The trace 
(Section 2.7) is another important example of such a function. 





CHAPTER 6 Linear Mappings and Matrices 


EXAMPLE 6.8 Consider the following linear operator F and bases E and S of R 2 : 

F(x,y) = (2x + 3y, 4x-5y), E = {(1,0), (0,1)}, S = {(1,2), (2,5)} 


By Example 6.1, the matrix representations of $F$ relative to the bases $E$ and $S$ are, respectively,

\[ A = \begin{bmatrix} 2 & 3 \\ 4 & -5 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 52 & 129 \\ -22 & -55 \end{bmatrix} \]

Using matrix $A$, we have

(i) Determinant of $F = \det(A) = -10 - 12 = -22$; (ii) Trace of $F = \operatorname{tr}(A) = 2 - 5 = -3$.

On the other hand, using matrix $B$, we have

(i) Determinant of $F = \det(B) = -2860 + 2838 = -22$; (ii) Trace of $F = \operatorname{tr}(B) = 52 - 55 = -3$.

As expected, both matrices yield the same result. 
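
The invariance shown in Example 6.8 can be checked numerically. The sketch below is an illustration only: `det2`, `trace`, and `matmul` are hypothetical helper names, and the matrix `P` is the change-of-basis matrix from $E$ to $S$, whose columns are the vectors of $S$ (Proposition 6.5), with its inverse written out by hand.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 3], [4, -5]]            # [F]_E
B = [[52, 129], [-22, -55]]      # [F]_S
P = [[1, 2], [2, 5]]             # columns are the vectors of S
P_inv = [[5, -2], [-2, 1]]       # inverse of P (det P = 1)

print(det2(A), det2(B), trace(A), trace(B))
print(matmul(matmul(P_inv, A), P))   # should reproduce B
```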


6.5 Matrices and General Linear Mappings 


Last, we consider the general case of linear mappings from one vector space into another. Suppose V and U 
are vector spaces over the same field K and, say, dim V = m and dim U = n. Furthermore, suppose 

\[ S = \{v_1, v_2, \ldots, v_m\} \quad\text{and}\quad S' = \{u_1, u_2, \ldots, u_n\} \]

are arbitrary but fixed bases, respectively, of $V$ and $U$.

Suppose $F: V \to U$ is a linear mapping. Then the vectors $F(v_1), F(v_2), \ldots, F(v_m)$ belong to $U$, and so each is a linear combination of the basis vectors in $S'$; say,

\[ \begin{aligned} F(v_1) &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\ F(v_2) &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\ &\;\;\vdots \\ F(v_m) &= a_{m1}u_1 + a_{m2}u_2 + \cdots + a_{mn}u_n \end{aligned} \]

DEFINITION: The transpose of the above matrix of coefficients, denoted by $m_{S,S'}(F)$ or $[F]_{S,S'}$, is called the matrix representation of $F$ relative to the bases $S$ and $S'$. [We will use the simple notation $m(F)$ and $[F]$ when the bases are understood.]

The following theorem is analogous to Theorem 6.1 for linear operators (Problem 6.67). 

THEOREM 6.10: For any vector $v \in V$, $[F]_{S,S'}[v]_S = [F(v)]_{S'}$.

That is, multiplying the coordinates of v in the basis S of V by [F], we obtain the coordinates of F(v) in 
the basis S' of U. 

Recall that for any vector spaces $V$ and $U$, the collection of all linear mappings from $V$ into $U$ is a vector space and is denoted by $\operatorname{Hom}(V, U)$. The following theorem is analogous to Theorem 6.2 for linear operators, where now we let $\mathbf{M} = \mathbf{M}_{m,n}$ denote the vector space of all $m \times n$ matrices (Problem 6.67).

THEOREM 6.11: The mapping $m: \operatorname{Hom}(V, U) \to \mathbf{M}$ defined by $m(F) = [F]$ is a vector space isomorphism. That is, for any $F, G \in \operatorname{Hom}(V, U)$ and any scalar $k$,

(i) $m(F + G) = m(F) + m(G)$ or $[F + G] = [F] + [G]$

(ii) $m(kF) = km(F)$ or $[kF] = k[F]$

(iii) $m$ is bijective (one-to-one and onto).
Our next theorem is analogous to Theorem 6.3 for linear operators (Problem 6.67).

THEOREM 6.12: Let $S, S', S''$ be bases of vector spaces $V, U, W$, respectively. Let $F: V \to U$ and $G: U \to W$ be linear mappings. Then

\[ [G \circ F]_{S,S''} = [G]_{S',S''}[F]_{S,S'} \]

That is, relative to the appropriate bases, the matrix representation of the composition of two mappings is the matrix product of the matrix representations of the individual mappings.

Next we show how the matrix representation of a linear mapping $F: V \to U$ is affected when new bases are selected (Problem 6.67).

THEOREM 6.13: Let $P$ be the change-of-basis matrix from a basis $e$ to a basis $e'$ in $V$, and let $Q$ be the change-of-basis matrix from a basis $f$ to a basis $f'$ in $U$. Then, for any linear map $F: V \to U$,

\[ [F]_{e',f'} = Q^{-1}[F]_{e,f}P \]

In other words, if $A$ is the matrix representation of a linear mapping $F$ relative to the bases $e$ and $f$, and $B$ is the matrix representation of $F$ relative to the bases $e'$ and $f'$, then

\[ B = Q^{-1}AP \]


Our last theorem, proved in Problem 6.36, shows that any linear mapping from one vector space V 
into another vector space U can be represented by a very simple matrix. We note that this theorem is 
analogous to Theorem 3.18 for m x n matrices. 


THEOREM 6.14: Let $F: V \to U$ be linear and, say, $\operatorname{rank}(F) = r$. Then there exist bases of $V$ and $U$ such that the matrix representation of $F$ has the form

\[ A = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \]

where $I_r$ is the $r$-square identity matrix.


The above matrix A is called the normal or canonical form of the linear map F. 


SOLVED PROBLEMS 


Matrix Representation of Linear Operators 

6.1. Consider the linear mapping $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (3x + 4y, 2x - 5y)$ and the following bases of $\mathbf{R}^2$:

\[ E = \{e_1, e_2\} = \{(1,0), (0,1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,2), (2,3)\} \]

(a) Find the matrix $A$ representing $F$ relative to the basis $E$.

(b) Find the matrix $B$ representing $F$ relative to the basis $S$.

(a) Because $E$ is the usual basis, the rows of $A$ are simply the coefficients in the components of $F(x, y)$; that is, using $(a, b) = ae_1 + be_2$, we have

\[ \begin{aligned} F(e_1) &= F(1,0) = (3,2) = 3e_1 + 2e_2 \\ F(e_2) &= F(0,1) = (4,-5) = 4e_1 - 5e_2 \end{aligned} \quad\text{and so}\quad A = \begin{bmatrix} 3 & 4 \\ 2 & -5 \end{bmatrix} \]

Note that the coefficients of the basis vectors are written as columns in the matrix representation.
(b) First find $F(u_1)$ and write it as a linear combination of the basis vectors $u_1$ and $u_2$. We have

\[ F(u_1) = F(1,2) = (11,-8) = x(1,2) + y(2,3), \quad\text{and so}\quad \begin{aligned} x + 2y &= 11 \\ 2x + 3y &= -8 \end{aligned} \]

Solve the system to obtain $x = -49$, $y = 30$. Therefore,

\[ F(u_1) = -49u_1 + 30u_2 \]

Next find $F(u_2)$ and write it as a linear combination of the basis vectors $u_1$ and $u_2$. We have

\[ F(u_2) = F(2,3) = (18,-11) = x(1,2) + y(2,3), \quad\text{and so}\quad \begin{aligned} x + 2y &= 18 \\ 2x + 3y &= -11 \end{aligned} \]

Solve for $x$ and $y$ to obtain $x = -76$, $y = 47$. Hence,

\[ F(u_2) = -76u_1 + 47u_2 \]

Write the coefficients of $u_1$ and $u_2$ as columns to obtain $B = \begin{bmatrix} -49 & -76 \\ 30 & 47 \end{bmatrix}$.

(b') Alternatively, one can first find the coordinates of an arbitrary vector $(a, b)$ in $\mathbf{R}^2$ relative to the basis $S$. We have

\[ (a, b) = x(1,2) + y(2,3) = (x + 2y, 2x + 3y), \quad\text{and so}\quad \begin{aligned} x + 2y &= a \\ 2x + 3y &= b \end{aligned} \]

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -3a + 2b$, $y = 2a - b$. Thus,

\[ (a, b) = (-3a + 2b)u_1 + (2a - b)u_2 \]

Then use the formula for $(a, b)$ to find the coordinates of $F(u_1)$ and $F(u_2)$ relative to $S$:

\[ \begin{aligned} F(u_1) &= F(1,2) = (11,-8) = -49u_1 + 30u_2 \\ F(u_2) &= F(2,3) = (18,-11) = -76u_1 + 47u_2 \end{aligned} \quad\text{and so}\quad B = \begin{bmatrix} -49 & -76 \\ 30 & 47 \end{bmatrix} \]


6.2. Consider the following linear operator $G$ on $\mathbf{R}^2$ and basis $S$:

\[ G(x, y) = (2x - 7y, 4x + 3y) \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,3), (2,5)\} \]

(a) Find the matrix representation $[G]_S$ of $G$ relative to $S$.

(b) Verify $[G]_S[v]_S = [G(v)]_S$ for the vector $v = (4, -3)$ in $\mathbf{R}^2$.

First find the coordinates of an arbitrary vector $v = (a, b)$ in $\mathbf{R}^2$ relative to the basis $S$. We have

\[ \begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ 3 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix}, \quad\text{and so}\quad \begin{aligned} x + 2y &= a \\ 3x + 5y &= b \end{aligned} \]

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -5a + 2b$, $y = 3a - b$. Thus,

\[ (a, b) = (-5a + 2b)u_1 + (3a - b)u_2, \quad\text{and so}\quad [v] = [-5a + 2b,\ 3a - b]^T \]

(a) Using the formula for $(a, b)$ and $G(x, y) = (2x - 7y, 4x + 3y)$, we have

\[ \begin{aligned} G(u_1) &= G(1,3) = (-19, 13) = 121u_1 - 70u_2 \\ G(u_2) &= G(2,5) = (-31, 23) = 201u_1 - 116u_2 \end{aligned} \quad\text{and so}\quad [G]_S = \begin{bmatrix} 121 & 201 \\ -70 & -116 \end{bmatrix} \]

(We emphasize that the coefficients of $u_1$ and $u_2$ are written as columns, not rows, in the matrix representation.)

(b) Use the formula $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$ to get

\[ \begin{aligned} v &= (4, -3) = -26u_1 + 15u_2 \\ G(v) &= G(4, -3) = (29, 7) = -131u_1 + 80u_2 \end{aligned} \]

Then

\[ [v]_S = [-26, 15]^T \quad\text{and}\quad [G(v)]_S = [-131, 80]^T \]

Accordingly,

\[ [G]_S[v]_S = \begin{bmatrix} 121 & 201 \\ -70 & -116 \end{bmatrix} \begin{bmatrix} -26 \\ 15 \end{bmatrix} = \begin{bmatrix} -131 \\ 80 \end{bmatrix} = [G(v)]_S \]

(This is expected from Theorem 6.1.)


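6.3. Consider the following $2 \times 2$ matrix $A$ and basis $S$ of $\mathbf{R}^2$:

\[ A = \begin{bmatrix} 2 & 4 \\ 5 & 6 \end{bmatrix} \quad\text{and}\quad S = \{u_1, u_2\} = \left\{ \begin{bmatrix} 1 \\ -2 \end{bmatrix}, \begin{bmatrix} 3 \\ -7 \end{bmatrix} \right\} \]

The matrix $A$ defines a linear operator on $\mathbf{R}^2$. Find the matrix $B$ that represents the mapping $A$ relative to the basis $S$.

First find the coordinates of an arbitrary vector $(a, b)^T$ with respect to the basis $S$. We have

\[ \begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ -2 \end{bmatrix} + y\begin{bmatrix} 3 \\ -7 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 3y &= a \\ -2x - 7y &= b \end{aligned} \]

Solve for $x$ and $y$ in terms of $a$ and $b$ to obtain $x = 7a + 3b$, $y = -2a - b$. Thus,

\[ (a, b)^T = (7a + 3b)u_1 + (-2a - b)u_2 \]

Then use the formula for $(a, b)^T$ to find the coordinates of $Au_1$ and $Au_2$ relative to the basis $S$:

\[ \begin{aligned} Au_1 &= \begin{bmatrix} 2 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} -6 \\ -7 \end{bmatrix} = -63u_1 + 19u_2 \\ Au_2 &= \begin{bmatrix} 2 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} 3 \\ -7 \end{bmatrix} = \begin{bmatrix} -22 \\ -27 \end{bmatrix} = -235u_1 + 71u_2 \end{aligned} \]

Writing the coordinates as columns yields

\[ B = \begin{bmatrix} -63 & -235 \\ 19 & 71 \end{bmatrix} \]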

6.4. Find the matrix representation of each of the following linear operators $F$ on $\mathbf{R}^3$ relative to the usual basis $E = \{e_1, e_2, e_3\}$ of $\mathbf{R}^3$; that is, find $[F] = [F]_E$:

(a) $F$ defined by $F(x, y, z) = (x + 2y - 3z,\ 4x - 5y - 6z,\ 7x + 8y + 9z)$.

(b) $F$ defined by the $3 \times 3$ matrix $A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \\ 5 & 5 & 5 \end{bmatrix}$.

(c) $F$ defined by $F(e_1) = (1,3,5)$, $F(e_2) = (2,4,6)$, $F(e_3) = (7,7,7)$. (Theorem 5.2 states that a linear map is completely defined by its action on the vectors in a basis.)

(a) Because $E$ is the usual basis, simply write the coefficients of the components of $F(x, y, z)$ as rows:

\[ [F] = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -5 & -6 \\ 7 & 8 & 9 \end{bmatrix} \]

(b) Because $E$ is the usual basis, $[F] = A$, the matrix $A$ itself.

(c) Here

\[ \begin{aligned} F(e_1) &= (1,3,5) = e_1 + 3e_2 + 5e_3 \\ F(e_2) &= (2,4,6) = 2e_1 + 4e_2 + 6e_3 \\ F(e_3) &= (7,7,7) = 7e_1 + 7e_2 + 7e_3 \end{aligned} \quad\text{and so}\quad [F] = \begin{bmatrix} 1 & 2 & 7 \\ 3 & 4 & 7 \\ 5 & 6 & 7 \end{bmatrix} \]

That is, the columns of $[F]$ are the images of the usual basis vectors.

6.5. Let $G$ be the linear operator on $\mathbf{R}^3$ defined by $G(x, y, z) = (2y + z,\ x - 4y,\ 3x)$.

(a) Find the matrix representation of $G$ relative to the basis

\[ S = \{w_1, w_2, w_3\} = \{(1,1,1), (1,1,0), (1,0,0)\} \]

(b) Verify that $[G][v] = [G(v)]$ for any vector $v$ in $\mathbf{R}^3$.

First find the coordinates of an arbitrary vector $(a, b, c) \in \mathbf{R}^3$ with respect to the basis $S$. Write $(a, b, c)$ as a linear combination of $w_1, w_2, w_3$ using unknown scalars $x$, $y$, and $z$:

\[ (a, b, c) = x(1,1,1) + y(1,1,0) + z(1,0,0) = (x + y + z,\ x + y,\ x) \]

Set corresponding components equal to each other to obtain the system of equations

\[ x + y + z = a, \qquad x + y = b, \qquad x = c \]

Solve the system for $x, y, z$ in terms of $a, b, c$ to find $x = c$, $y = b - c$, $z = a - b$. Thus,

\[ (a, b, c) = cw_1 + (b - c)w_2 + (a - b)w_3, \quad\text{or equivalently,}\quad [(a, b, c)] = [c,\ b - c,\ a - b]^T \]

(a) Because $G(x, y, z) = (2y + z,\ x - 4y,\ 3x)$,

\[ \begin{aligned} G(w_1) &= G(1,1,1) = (3,-3,3) = 3w_1 - 6w_2 + 6w_3 \\ G(w_2) &= G(1,1,0) = (2,-3,3) = 3w_1 - 6w_2 + 5w_3 \\ G(w_3) &= G(1,0,0) = (0,1,3) = 3w_1 - 2w_2 - w_3 \end{aligned} \]

Write the coordinates of $G(w_1)$, $G(w_2)$, $G(w_3)$ as columns to get

\[ [G] = \begin{bmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{bmatrix} \]

(b) Write $G(v)$ as a linear combination of $w_1, w_2, w_3$, where $v = (a, b, c)$ is an arbitrary vector in $\mathbf{R}^3$:

\[ G(v) = G(a, b, c) = (2b + c,\ a - 4b,\ 3a) = 3aw_1 + (-2a - 4b)w_2 + (-a + 6b + c)w_3 \]

or equivalently,

\[ [G(v)] = [3a,\ -2a - 4b,\ -a + 6b + c]^T \]

Accordingly,

\[ [G][v] = \begin{bmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{bmatrix} \begin{bmatrix} c \\ b - c \\ a - b \end{bmatrix} = \begin{bmatrix} 3a \\ -2a - 4b \\ -a + 6b + c \end{bmatrix} = [G(v)] \]

6.6. Consider the following $3 \times 3$ matrix $A$ and basis $S$ of $\mathbf{R}^3$:

\[ A = \begin{bmatrix} 1 & -2 & 1 \\ 3 & -1 & 0 \\ 1 & 4 & -2 \end{bmatrix} \quad\text{and}\quad S = \{u_1, u_2, u_3\} = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \right\} \]

The matrix $A$ defines a linear operator on $\mathbf{R}^3$. Find the matrix $B$ that represents the mapping $A$ relative to the basis $S$. (Recall that $A$ represents itself relative to the usual basis of $\mathbf{R}^3$.)

First find the coordinates of an arbitrary vector $(a, b, c)$ in $\mathbf{R}^3$ with respect to the basis $S$. We have

\[ \begin{bmatrix} a \\ b \\ c \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + z\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + z &= a \\ x + y + 2z &= b \\ x + y + 3z &= c \end{aligned} \]

Solve for $x, y, z$ in terms of $a, b, c$ to get

\[ x = a + b - c, \qquad y = -a + 2b - c, \qquad z = c - b \]

thus, $(a, b, c)^T = (a + b - c)u_1 + (-a + 2b - c)u_2 + (c - b)u_3$.

Then use the formula for $(a, b, c)^T$ to find the coordinates of $Au_1$, $Au_2$, $Au_3$ relative to the basis $S$:

\[ \begin{aligned} A(u_1) &= A(1,1,1)^T = (0,2,3)^T = -u_1 + u_2 + u_3 \\ A(u_2) &= A(0,1,1)^T = (-1,-1,2)^T = -4u_1 - 3u_2 + 3u_3 \\ A(u_3) &= A(1,2,3)^T = (0,1,3)^T = -2u_1 - u_2 + 2u_3 \end{aligned} \quad\text{so}\quad B = \begin{bmatrix} -1 & -4 & -2 \\ 1 & -3 & -1 \\ 1 & 3 & 2 \end{bmatrix} \]

6.7. For each of the following linear transformations (operators) $L$ on $\mathbf{R}^2$, find the matrix $A$ that represents $L$ (relative to the usual basis of $\mathbf{R}^2$):

(a) $L$ is defined by $L(1,0) = (2,4)$ and $L(0,1) = (5,8)$.

(b) $L$ is the rotation in $\mathbf{R}^2$ counterclockwise by $90°$.

(c) $L$ is the reflection in $\mathbf{R}^2$ about the line $y = -x$.

(a) Because $\{(1,0), (0,1)\}$ is the usual basis of $\mathbf{R}^2$, write their images under $L$ as columns to get

\[ A = \begin{bmatrix} 2 & 5 \\ 4 & 8 \end{bmatrix} \]

(b) Under the rotation $L$, we have $L(1,0) = (0,1)$ and $L(0,1) = (-1,0)$. Thus,

\[ A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \]

(c) Under the reflection $L$, we have $L(1,0) = (0,-1)$ and $L(0,1) = (-1,0)$. Thus,

\[ A = \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix} \]
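
The three matrices in Problem 6.7 are all built the same way: the images of the usual basis vectors become the columns. The sketch below illustrates this with hypothetical helpers `from_images` and `apply`; the test vector $(3, 4)$ is an arbitrary choice.

```python
def from_images(images):
    """Matrix relative to the usual basis: images[k] = L(e_k) becomes column k."""
    n = len(images)
    return [[images[k][i] for k in range(n)] for i in range(n)]

def apply(A, v):
    """Matrix-vector product A v."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

general    = from_images([(2, 4), (5, 8)])      # part (a)
rotation   = from_images([(0, 1), (-1, 0)])     # part (b): 90 degrees counterclockwise
reflection = from_images([(0, -1), (-1, 0)])    # part (c): about the line y = -x

print(general, rotation, reflection)
print(apply(rotation, (3, 4)))      # (x, y) -> (-y, x)
print(apply(reflection, (3, 4)))    # (x, y) -> (-y, -x)
```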


6.8. The set $S = \{e^{3t}, te^{3t}, t^2e^{3t}\}$ is a basis of a vector space $V$ of functions $f: \mathbf{R} \to \mathbf{R}$. Let $D$ be the differential operator on $V$; that is, $D(f) = df/dt$. Find the matrix representation of $D$ relative to the basis $S$.

Find the image of each basis function:

\[ \begin{aligned} D(e^{3t}) &= 3e^{3t} = 3(e^{3t}) + 0(te^{3t}) + 0(t^2e^{3t}) \\ D(te^{3t}) &= e^{3t} + 3te^{3t} = 1(e^{3t}) + 3(te^{3t}) + 0(t^2e^{3t}) \\ D(t^2e^{3t}) &= 2te^{3t} + 3t^2e^{3t} = 0(e^{3t}) + 2(te^{3t}) + 3(t^2e^{3t}) \end{aligned} \quad\text{and thus,}\quad [D] = \begin{bmatrix} 3 & 1 & 0 \\ 0 & 3 & 2 \\ 0 & 0 & 3 \end{bmatrix} \]
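
One way to sanity-check the matrix $[D]$ of Problem 6.8 is to compare it against a numerical derivative. The sketch below is an illustration under stated assumptions: a function in $V$ is stored as its coordinate list `[c0, c1, c2]` relative to $S$, the sample coefficients and the evaluation point are arbitrary, and a central difference stands in for the exact derivative.

```python
import math

D = [[3, 1, 0], [0, 3, 2], [0, 0, 3]]       # [D] relative to S

def evaluate(c, t):
    """Value at t of c0*e^{3t} + c1*t*e^{3t} + c2*t^2*e^{3t}."""
    return (c[0] + c[1] * t + c[2] * t * t) * math.exp(3 * t)

def apply(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

c = [2, -1, 4]            # coordinates of a sample function f in V
dc = apply(D, c)          # coordinates of D(f), by Theorem 6.1

t, h = 0.3, 1e-5
numeric = (evaluate(c, t + h) - evaluate(c, t - h)) / (2 * h)   # central difference
exact = evaluate(dc, t)                                         # derivative via [D]

print(dc, numeric, exact)
```

The two printed values of the derivative should agree up to the small error of the finite-difference approximation.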


6.9. Prove Theorem 6.1: Let $T: V \to V$ be a linear operator, and let $S$ be a (finite) basis of $V$. Then, for any vector $v$ in $V$, $[T]_S[v]_S = [T(v)]_S$.

Suppose $S = \{u_1, u_2, \ldots, u_n\}$, and suppose, for $i = 1, \ldots, n$,

\[ T(u_i) = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = \sum_{j=1}^{n} a_{ij}u_j \]

Then $[T]_S$ is the $n$-square matrix whose $j$th row is

\[ (a_{1j}, a_{2j}, \ldots, a_{nj}) \tag{1} \]

Now suppose

\[ v = k_1u_1 + k_2u_2 + \cdots + k_nu_n = \sum_{i=1}^{n} k_iu_i \]

Writing a column vector as the transpose of a row vector, we have

\[ [v]_S = [k_1, k_2, \ldots, k_n]^T \tag{2} \]

Furthermore, using the linearity of $T$,

\[ \begin{aligned} T(v) &= T\left( \sum_{i=1}^{n} k_iu_i \right) = \sum_{i=1}^{n} k_iT(u_i) = \sum_{i=1}^{n} k_i \left( \sum_{j=1}^{n} a_{ij}u_j \right) \\ &= \sum_{j=1}^{n} \left( \sum_{i=1}^{n} a_{ij}k_i \right) u_j = \sum_{j=1}^{n} (a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n)u_j \end{aligned} \]

Thus, $[T(v)]_S$ is the column vector whose $j$th entry is

\[ a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n \tag{3} \]

On the other hand, the $j$th entry of $[T]_S[v]_S$ is obtained by multiplying the $j$th row of $[T]_S$ by $[v]_S$, that is, (1) by (2). But the product of (1) and (2) is (3). Hence, $[T]_S[v]_S$ and $[T(v)]_S$ have the same entries. Thus, $[T]_S[v]_S = [T(v)]_S$.


6.10. Prove Theorem 6.2: Let $S = \{u_1, u_2, \ldots, u_n\}$ be a basis for $V$ over $K$, and let $\mathbf{M}$ be the algebra of $n$-square matrices over $K$. Then the mapping $m: A(V) \to \mathbf{M}$ defined by $m(T) = [T]_S$ is a vector space isomorphism. That is, for any $F, G \in A(V)$ and any $k \in K$, we have

(i) $[F + G] = [F] + [G]$, (ii) $[kF] = k[F]$, (iii) $m$ is one-to-one and onto.

(i) Suppose, for $i = 1, \ldots, n$,

\[ F(u_i) = \sum_{j=1}^{n} a_{ij}u_j \quad\text{and}\quad G(u_i) = \sum_{j=1}^{n} b_{ij}u_j \]

Consider the matrices $A = [a_{ij}]$ and $B = [b_{ij}]$. Then $[F] = A^T$ and $[G] = B^T$. We have, for $i = 1, \ldots, n$,

\[ (F + G)(u_i) = F(u_i) + G(u_i) = \sum_{j=1}^{n} (a_{ij} + b_{ij})u_j \]

Because $A + B$ is the matrix $(a_{ij} + b_{ij})$, we have

\[ [F + G] = (A + B)^T = A^T + B^T = [F] + [G] \]

(ii) Also, for $i = 1, \ldots, n$,

\[ (kF)(u_i) = kF(u_i) = k\sum_{j=1}^{n} a_{ij}u_j = \sum_{j=1}^{n} (ka_{ij})u_j \]

Because $kA$ is the matrix $(ka_{ij})$, we have

\[ [kF] = (kA)^T = kA^T = k[F] \]

(iii) Finally, $m$ is one-to-one, because a linear mapping is completely determined by its values on a basis. Also, $m$ is onto, because any matrix $A = [a_{ij}]$ in $\mathbf{M}$ is the image of the linear operator $F$ defined by

\[ F(u_i) = \sum_{j=1}^{n} a_{ji}u_j, \qquad i = 1, \ldots, n \]

Thus, the theorem is proved.


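6.11. Prove Theorem 6.3: For any linear operators $G, F \in A(V)$, $[G \circ F] = [G][F]$.

Using the notation in Problem 6.10, we have

\[ \begin{aligned} (G \circ F)(u_i) &= G(F(u_i)) = G\left( \sum_{j=1}^{n} a_{ij}u_j \right) = \sum_{j=1}^{n} a_{ij}G(u_j) \\ &= \sum_{j=1}^{n} a_{ij} \left( \sum_{k=1}^{n} b_{jk}u_k \right) = \sum_{k=1}^{n} \left( \sum_{j=1}^{n} a_{ij}b_{jk} \right) u_k \end{aligned} \]

Recall that $AB$ is the matrix $AB = [c_{ik}]$, where $c_{ik} = \sum_{j=1}^{n} a_{ij}b_{jk}$. Accordingly,

\[ [G \circ F] = (AB)^T = B^TA^T = [G][F] \]

The theorem is proved.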
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6.12. Let A be the matrix representation of a linear operator T. Prove that, for any polynomial/(t), we 
have that f(A ) is the matrix representation of f(T). [Thus, f(T) = 0 if and only if/(A) = 0.] 

Let (f> be the mapping that sends an operator T into its matrix representation A. We need to prove that 
=f(A). Suppose/(f) = a n t" + • • • + a^t + a 0 . The proof is by induction on n, the degree of f(t). 

Suppose n = 0. Recall that </>(/') = /, where I' is the identity mapping and I is the identity matrix. Thus, 

<M/(T)) = </>(«(/) = a Q (t>{I') = a 0 I =f(A) 
and so the theorem holds for n = 0. 

Now assume the theorem holds for polynomials of degree less than n. Then, because cj) is an algebra 
isomorphism, 

= 4>( a nT" + a n _{T n ~ l + ... + a x T + a 0 l') 

= a„4>{T)(j){T n - x ) + </>(«„_, r- 1 + --.+UJ + a 0 I') 

= a n AA" 1 + (a n _\A n 1 + • • • + a,A + ciqI) = f(A) 


and the theorem is proved. 


Change of Basis 

The coordinate vector $[v]_S$ in this section will always denote a column vector; that is,

\[ [v]_S = [a_1, a_2, \ldots, a_n]^T \]

6.13. Consider the following bases of $\mathbf{R}^2$:

\[ E = \{e_1, e_2\} = \{(1,0), (0,1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,3), (1,4)\} \]

(a) Find the change-of-basis matrix $P$ from the usual basis $E$ to $S$.

(b) Find the change-of-basis matrix $Q$ from $S$ back to $E$.

(c) Find the coordinate vector $[v]_S$ of $v = (5, -3)$ relative to $S$.

(a) Because $E$ is the usual basis, simply write the basis vectors in $S$ as columns:

\[ P = \begin{bmatrix} 1 & 1 \\ 3 & 4 \end{bmatrix} \]

(b) Method 1. Use the definition of the change-of-basis matrix. That is, express each vector in $E$ as a linear combination of the vectors in $S$. We do this by first finding the coordinates of an arbitrary vector $v = (a, b)$ relative to $S$. We have

\[ (a, b) = x(1,3) + y(1,4) = (x + y,\ 3x + 4y) \quad\text{or}\quad \begin{aligned} x + y &= a \\ 3x + 4y &= b \end{aligned} \]

Solve for $x$ and $y$ to obtain $x = 4a - b$, $y = -3a + b$. Thus,

\[ v = (4a - b)u_1 + (-3a + b)u_2 \quad\text{and}\quad [v]_S = [(a,b)]_S = [4a - b,\ -3a + b]^T \]

Using the above formula for $[v]_S$ and writing the coordinates of the $e_i$ as columns yields

\[ \begin{aligned} e_1 &= (1,0) = 4u_1 - 3u_2 \\ e_2 &= (0,1) = -u_1 + u_2 \end{aligned} \quad\text{and}\quad Q = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix} \]

Method 2. Because $Q = P^{-1}$, find $P^{-1}$, say by using the formula for the inverse of a $2 \times 2$ matrix. Thus,

\[ P^{-1} = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix} \]

(c) Method 1. Write $v$ as a linear combination of the vectors in $S$, say by using the above formula for $v = (a, b)$. We have $v = (5, -3) = 23u_1 - 18u_2$, and so $[v]_S = [23, -18]^T$.

Method 2. Use, from Theorem 6.6, the fact that $[v]_S = P^{-1}[v]_E$ and the fact that $[v]_E = [5, -3]^T$:

\[ [v]_S = P^{-1}[v]_E = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix}\begin{bmatrix} 5 \\ -3 \end{bmatrix} = \begin{bmatrix} 23 \\ -18 \end{bmatrix} \]

6.14. The vectors $u_1 = (1,2,0)$, $u_2 = (1,3,2)$, $u_3 = (0,1,3)$ form a basis $S$ of $\mathbf{R}^3$. Find

(a) The change-of-basis matrix $P$ from the usual basis $E = \{e_1, e_2, e_3\}$ to $S$.

(b) The change-of-basis matrix $Q$ from $S$ back to $E$.

(a) Because $E$ is the usual basis, simply write the basis vectors of $S$ as columns:

\[ P = \begin{bmatrix} 1 & 1 & 0 \\ 2 & 3 & 1 \\ 0 & 2 & 3 \end{bmatrix} \]

(b) Method 1. Express each basis vector of $E$ as a linear combination of the basis vectors of $S$ by first finding the coordinates of an arbitrary vector $v = (a, b, c)$ relative to the basis $S$. We have

\[ \begin{bmatrix} a \\ b \\ c \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + y\begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} + z\begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + y &= a \\ 2x + 3y + z &= b \\ 2y + 3z &= c \end{aligned} \]

Solve for $x, y, z$ to get $x = 7a - 3b + c$, $y = -6a + 3b - c$, $z = 4a - 2b + c$. Thus,

\[ v = (a, b, c) = (7a - 3b + c)u_1 + (-6a + 3b - c)u_2 + (4a - 2b + c)u_3 \]

or $[v]_S = [(a,b,c)]_S = [7a - 3b + c,\ -6a + 3b - c,\ 4a - 2b + c]^T$.

Using the above formula for $[v]_S$ and then writing the coordinates of the $e_i$ as columns yields

\[ \begin{aligned} e_1 &= (1,0,0) = 7u_1 - 6u_2 + 4u_3 \\ e_2 &= (0,1,0) = -3u_1 + 3u_2 - 2u_3 \\ e_3 &= (0,0,1) = u_1 - u_2 + u_3 \end{aligned} \quad\text{and}\quad Q = \begin{bmatrix} 7 & -3 & 1 \\ -6 & 3 & -1 \\ 4 & -2 & 1 \end{bmatrix} \]

Method 2. Find $P^{-1}$ by row reducing $M = [P, I]$ to the form $[I, P^{-1}]$:

\[ M = \left[\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 2 & 3 & 1 & 0 & 1 & 0 \\ 0 & 2 & 3 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & -2 & 1 & 0 \\ 0 & 2 & 3 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & -2 & 1 & 0 \\ 0 & 0 & 1 & 4 & -2 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 7 & -3 & 1 \\ 0 & 1 & 0 & -6 & 3 & -1 \\ 0 & 0 & 1 & 4 & -2 & 1 \end{array}\right] = [I, P^{-1}] \]

Thus,

\[ Q = P^{-1} = \begin{bmatrix} 7 & -3 & 1 \\ -6 & 3 & -1 \\ 4 & -2 & 1 \end{bmatrix} \]
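Problems 6.13 and 6.14 both come down to inverting the matrix $P$ whose columns are the vectors of $S$. The following is a minimal sketch of the row reduction of $[P, I]$ to $[I, P^{-1}]$ using exact fractions. The function name `inverse_by_row_reduction` and the variable names are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def inverse_by_row_reduction(p):
    """Row reduce M = [P, I] to [I, P^-1] and return P^-1."""
    n = len(p)
    # Build the augmented matrix M = [P, I] with exact rational entries.
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(p)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [x / scale for x in m[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

P2 = [[1, 1], [3, 4]]                   # Problem 6.13: columns u1, u2
P3 = [[1, 1, 0], [2, 3, 1], [0, 2, 3]]  # Problem 6.14: columns u1, u2, u3
Q2 = inverse_by_row_reduction(P2)
Q3 = inverse_by_row_reduction(P3)

# Coordinates of v = (5, -3) relative to S in Problem 6.13: [v]_S = Q2 [v]_E
v_S = [sum(q * x for q, x in zip(row, [5, -3])) for row in Q2]
```

Exact `Fraction` arithmetic is used here so that the entries match the hand computation rather than floating-point approximations.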


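6.15. Suppose the $x$-axis and $y$-axis in the plane $\mathbf{R}^2$ are rotated counterclockwise $45°$ so that the new $x'$-axis and $y'$-axis are along the line $y = x$ and the line $y = -x$, respectively.

(a) Find the change-of-basis matrix $P$.

(b) Find the coordinates of the point $A(5,6)$ under the given rotation.

(a) The unit vectors in the direction of the new $x'$- and $y'$-axes are

\[ u_1 = \left(\tfrac{1}{2}\sqrt{2},\ \tfrac{1}{2}\sqrt{2}\right) \quad\text{and}\quad u_2 = \left(-\tfrac{1}{2}\sqrt{2},\ \tfrac{1}{2}\sqrt{2}\right) \]

(The unit vectors in the direction of the original $x$ and $y$ axes are the usual basis of $\mathbf{R}^2$.) Thus, write the coordinates of $u_1$ and $u_2$ as columns to obtain

\[ P = \begin{bmatrix} \tfrac{1}{2}\sqrt{2} & -\tfrac{1}{2}\sqrt{2} \\ \tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{2} \end{bmatrix} \]

(b) Multiply the coordinates of the point by $P^{-1}$:

\[ \begin{bmatrix} \tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{2} \\ -\tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{2} \end{bmatrix}\begin{bmatrix} 5 \\ 6 \end{bmatrix} = \begin{bmatrix} \tfrac{11}{2}\sqrt{2} \\ \tfrac{1}{2}\sqrt{2} \end{bmatrix} \]

(Because $P$ is orthogonal, $P^{-1}$ is simply the transpose of $P$.)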
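As a quick numerical check of Problem 6.15, the sketch below applies $P^{-1} = P^T$ to a point for a counterclockwise rotation of the axes. The function name `rotated_coordinates` is a hypothetical helper, and the result is a floating-point approximation of the exact values $\tfrac{11}{2}\sqrt{2}$ and $\tfrac{1}{2}\sqrt{2}$.

```python
import math

def rotated_coordinates(point, degrees):
    """Coordinates of `point` after the axes are rotated counterclockwise."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    x, y = point
    # The rows used here are the rows of P^T, which equals P^-1
    # because the rotation matrix P is orthogonal.
    return (c * x + s * y, -s * x + c * y)

new_a = rotated_coordinates((5, 6), 45)
```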






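6.16. The vectors $u_1 = (1,1,0)$, $u_2 = (0,1,1)$, $u_3 = (1,2,2)$ form a basis $S$ of $\mathbf{R}^3$. Find the coordinates of an arbitrary vector $v = (a, b, c)$ relative to the basis $S$.

Method 1. Express $v$ as a linear combination of $u_1, u_2, u_3$ using unknowns $x, y, z$. We have

\[ (a, b, c) = x(1,1,0) + y(0,1,1) + z(1,2,2) = (x + z,\ x + y + 2z,\ y + 2z) \]

This yields the system

\[ \begin{aligned} x + z &= a \\ x + y + 2z &= b \\ y + 2z &= c \end{aligned} \quad\text{or}\quad \begin{aligned} x + z &= a \\ y + z &= -a + b \\ y + 2z &= c \end{aligned} \quad\text{or}\quad \begin{aligned} x + z &= a \\ y + z &= -a + b \\ z &= a - b + c \end{aligned} \]

Solving by back-substitution yields $x = b - c$, $y = -2a + 2b - c$, $z = a - b + c$. Thus,

\[ [v]_S = [b - c,\ -2a + 2b - c,\ a - b + c]^T \]

Method 2. Find $P^{-1}$ by row reducing $M = [P, I]$ to the form $[I, P^{-1}]$, where $P$ is the change-of-basis matrix from the usual basis $E$ to $S$ or, in other words, the matrix whose columns are the basis vectors of $S$. We have

\[ M = \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 2 & 0 & 1 & 0 \\ 0 & 1 & 2 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 & 1 & 0 \\ 0 & 1 & 2 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 & 1 & 0 \\ 0 & 0 & 1 & 1 & -1 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 0 & 1 & -1 \\ 0 & 1 & 0 & -2 & 2 & -1 \\ 0 & 0 & 1 & 1 & -1 & 1 \end{array}\right] = [I, P^{-1}] \]

Thus,

\[ P^{-1} = \begin{bmatrix} 0 & 1 & -1 \\ -2 & 2 & -1 \\ 1 & -1 & 1 \end{bmatrix} \quad\text{and}\quad [v]_S = P^{-1}[v]_E = \begin{bmatrix} 0 & 1 & -1 \\ -2 & 2 & -1 \\ 1 & -1 & 1 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} b - c \\ -2a + 2b - c \\ a - b + c \end{bmatrix} \]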


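6.17. Consider the following bases of $\mathbf{R}^2$:

\[ S = \{u_1, u_2\} = \{(1,-2), (3,-4)\} \quad\text{and}\quad S' = \{v_1, v_2\} = \{(1,3), (3,8)\} \]

(a) Find the coordinates of $v = (a, b)$ relative to the basis $S$.

(b) Find the change-of-basis matrix $P$ from $S$ to $S'$.

(c) Find the coordinates of $v = (a, b)$ relative to the basis $S'$.

(d) Find the change-of-basis matrix $Q$ from $S'$ back to $S$.

(e) Verify $Q = P^{-1}$.

(f) Show that, for any vector $v = (a, b)$ in $\mathbf{R}^2$, $P^{-1}[v]_S = [v]_{S'}$. (See Theorem 6.6.)

(a) Let $v = xu_1 + yu_2$ for unknowns $x$ and $y$; that is,

\[ \begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ -2 \end{bmatrix} + y\begin{bmatrix} 3 \\ -4 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 3y &= a \\ -2x - 4y &= b \end{aligned} \quad\text{or}\quad \begin{aligned} x + 3y &= a \\ 2y &= 2a + b \end{aligned} \]

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -2a - \tfrac{3}{2}b$ and $y = a + \tfrac{1}{2}b$. Thus,

\[ (a, b) = \left(-2a - \tfrac{3}{2}b\right)u_1 + \left(a + \tfrac{1}{2}b\right)u_2 \quad\text{or}\quad [(a,b)]_S = \left[-2a - \tfrac{3}{2}b,\ a + \tfrac{1}{2}b\right]^T \]

(b) Use part (a) to write each of the basis vectors $v_1$ and $v_2$ of $S'$ as a linear combination of the basis vectors $u_1$ and $u_2$ of $S$; that is,

\[ \begin{aligned} v_1 &= (1,3) = \left(-2 - \tfrac{9}{2}\right)u_1 + \left(1 + \tfrac{3}{2}\right)u_2 = -\tfrac{13}{2}u_1 + \tfrac{5}{2}u_2 \\ v_2 &= (3,8) = (-6 - 12)u_1 + (3 + 4)u_2 = -18u_1 + 7u_2 \end{aligned} \]

Then $P$ is the matrix whose columns are the coordinates of $v_1$ and $v_2$ relative to the basis $S$; that is,

\[ P = \begin{bmatrix} -\tfrac{13}{2} & -18 \\ \tfrac{5}{2} & 7 \end{bmatrix} \]

(c) Let $v = xv_1 + yv_2$ for unknown scalars $x$ and $y$:

\[ \begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ 3 \end{bmatrix} + y\begin{bmatrix} 3 \\ 8 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 3y &= a \\ 3x + 8y &= b \end{aligned} \quad\text{or}\quad \begin{aligned} x + 3y &= a \\ -y &= b - 3a \end{aligned} \]

Solve for $x$ and $y$ to get $x = -8a + 3b$ and $y = 3a - b$. Thus,

\[ (a, b) = (-8a + 3b)v_1 + (3a - b)v_2 \quad\text{or}\quad [(a,b)]_{S'} = [-8a + 3b,\ 3a - b]^T \]

(d) Use part (c) to express each of the basis vectors $u_1$ and $u_2$ of $S$ as a linear combination of the basis vectors $v_1$ and $v_2$ of $S'$:

\[ \begin{aligned} u_1 &= (1,-2) = (-8 - 6)v_1 + (3 + 2)v_2 = -14v_1 + 5v_2 \\ u_2 &= (3,-4) = (-24 - 12)v_1 + (9 + 4)v_2 = -36v_1 + 13v_2 \end{aligned} \]

Write the coordinates of $u_1$ and $u_2$ relative to $S'$ as columns to obtain

\[ Q = \begin{bmatrix} -14 & -36 \\ 5 & 13 \end{bmatrix} \]

(e)

\[ QP = \begin{bmatrix} -14 & -36 \\ 5 & 13 \end{bmatrix}\begin{bmatrix} -\tfrac{13}{2} & -18 \\ \tfrac{5}{2} & 7 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I \]

(f) Use parts (a), (c), and (d) to obtain

\[ P^{-1}[v]_S = Q[v]_S = \begin{bmatrix} -14 & -36 \\ 5 & 13 \end{bmatrix}\begin{bmatrix} -2a - \tfrac{3}{2}b \\ a + \tfrac{1}{2}b \end{bmatrix} = \begin{bmatrix} -8a + 3b \\ 3a - b \end{bmatrix} = [v]_{S'} \]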
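The computations in Problem 6.17 can be cross-checked numerically. The sketch below rests on one observation that the problem uses implicitly: column $j$ of $P$ is $[v_j]_S$, so $P = M_S^{-1}M_{S'}$, where $M_S$ and $M_{S'}$ are the matrices whose columns are the vectors of $S$ and $S'$ (these two names and the helpers are assumptions made for illustration).

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

M_S = [[1, 3], [-2, -4]]    # columns u1 = (1, -2), u2 = (3, -4)
M_S2 = [[1, 3], [3, 8]]     # columns v1 = (1, 3),  v2 = (3, 8)

P = matmul(inverse_2x2(M_S), M_S2)   # change of basis from S to S'
Q = matmul(inverse_2x2(M_S2), M_S)   # change of basis from S' back to S
QP = matmul(Q, P)                    # should be the identity matrix
```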


6.18. Suppose $P$ is the change-of-basis matrix from a basis $\{u_i\}$ to a basis $\{w_i\}$, and suppose $Q$ is the change-of-basis matrix from the basis $\{w_i\}$ back to $\{u_i\}$. Prove that $P$ is invertible and that $Q = P^{-1}$.

Suppose, for $i = 1, 2, \ldots, n$, that

\[ w_i = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = \sum_{j=1}^{n} a_{ij}u_j \tag{1} \]

and, for $j = 1, 2, \ldots, n$,

\[ u_j = b_{j1}w_1 + b_{j2}w_2 + \cdots + b_{jn}w_n = \sum_{k=1}^{n} b_{jk}w_k \tag{2} \]

Let $A = [a_{ij}]$ and $B = [b_{jk}]$. Then $P = A^T$ and $Q = B^T$. Substituting (2) into (1) yields

\[ w_i = \sum_{j=1}^{n} a_{ij}\Big(\sum_{k=1}^{n} b_{jk}w_k\Big) = \sum_{k=1}^{n}\Big(\sum_{j=1}^{n} a_{ij}b_{jk}\Big)w_k \]

Because $\{w_i\}$ is a basis, $\sum_j a_{ij}b_{jk} = \delta_{ik}$, where $\delta_{ik}$ is the Kronecker delta; that is, $\delta_{ik} = 1$ if $i = k$ but $\delta_{ik} = 0$ if $i \neq k$. Suppose $AB = [c_{ik}]$. Then $c_{ik} = \delta_{ik}$. Accordingly, $AB = I$, and so

\[ QP = B^TA^T = (AB)^T = I^T = I \]

Thus, $Q = P^{-1}$.

6.19. Consider a finite sequence of vectors $S = \{u_1, u_2, \ldots, u_n\}$. Let $S'$ be the sequence of vectors obtained from $S$ by one of the following "elementary operations":

(1) Interchange two vectors.

(2) Multiply a vector by a nonzero scalar.

(3) Add a multiple of one vector to another vector.

Show that $S$ and $S'$ span the same subspace $W$. Also, show that $S'$ is linearly independent if and only if $S$ is linearly independent.

Observe that, for each operation, the vectors in $S'$ are linear combinations of vectors in $S$. Also, because each operation has an inverse of the same type, each vector in $S$ is a linear combination of vectors in $S'$. Thus, $S$ and $S'$ span the same subspace $W$. Moreover, $S'$ is linearly independent if and only if $\dim W = n$, and this is true if and only if $S$ is linearly independent.

6.20. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be row equivalent $m \times n$ matrices over a field $K$, and let $v_1, v_2, \ldots, v_n$ be any vectors in a vector space $V$ over $K$. For $i = 1, 2, \ldots, m$, let $u_i$ and $w_i$ be defined by

\[ u_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n \quad\text{and}\quad w_i = b_{i1}v_1 + b_{i2}v_2 + \cdots + b_{in}v_n \]

Show that $\{u_i\}$ and $\{w_i\}$ span the same subspace of $V$.

Applying an "elementary operation" of Problem 6.19 to $\{u_i\}$ is equivalent to applying an elementary row operation to the matrix $A$. Because $A$ and $B$ are row equivalent, $B$ can be obtained from $A$ by a sequence of elementary row operations. Hence, $\{w_i\}$ can be obtained from $\{u_i\}$ by the corresponding sequence of operations. Accordingly, $\{u_i\}$ and $\{w_i\}$ span the same space.

6.21. Suppose $u_1, u_2, \ldots, u_n$ belong to a vector space $V$ over a field $K$, and suppose $P = [a_{ij}]$ is an $n$-square matrix over $K$. For $i = 1, 2, \ldots, n$, let $v_i = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n$.

(a) Suppose $P$ is invertible. Show that $\{u_i\}$ and $\{v_i\}$ span the same subspace of $V$. Hence, $\{u_i\}$ is linearly independent if and only if $\{v_i\}$ is linearly independent.

(b) Suppose $P$ is singular (not invertible). Show that $\{v_i\}$ is linearly dependent.

(c) Suppose $\{v_i\}$ is linearly independent. Show that $P$ is invertible.

(a) Because $P$ is invertible, it is row equivalent to the identity matrix $I$. Hence, by Problem 6.19, $\{v_i\}$ and $\{u_i\}$ span the same subspace of $V$. Thus, one is linearly independent if and only if the other is linearly independent.

(b) Because $P$ is not invertible, it is row equivalent to a matrix with a zero row. This means $\{v_i\}$ spans a subspace that has a spanning set with fewer than $n$ elements. Thus, $\{v_i\}$ is linearly dependent.

(c) This is the contrapositive of the statement of part (b), and so it follows from part (b).

6.22. Prove Theorem 6.6: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any vector $v \in V$, we have $P[v]_{S'} = [v]_S$, and hence, $P^{-1}[v]_S = [v]_{S'}$.

Suppose $S = \{u_1, \ldots, u_n\}$ and $S' = \{w_1, \ldots, w_n\}$, and suppose, for $i = 1, \ldots, n$,

\[ w_i = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = \sum_{j=1}^{n} a_{ij}u_j \]

Then $P$ is the $n$-square matrix whose $j$th row is

\[ (a_{1j}, a_{2j}, \ldots, a_{nj}) \tag{1} \]

Also suppose $v = k_1w_1 + k_2w_2 + \cdots + k_nw_n = \sum_{i=1}^{n} k_iw_i$. Then

\[ [v]_{S'} = [k_1, k_2, \ldots, k_n]^T \tag{2} \]

Substituting for $w_i$ in the equation for $v$, we obtain

\[ v = \sum_{i=1}^{n} k_iw_i = \sum_{i=1}^{n} k_i\Big(\sum_{j=1}^{n} a_{ij}u_j\Big) = \sum_{j=1}^{n}\Big(\sum_{i=1}^{n} a_{ij}k_i\Big)u_j = \sum_{j=1}^{n} (a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n)u_j \]

Accordingly, $[v]_S$ is the column vector whose $j$th entry is

\[ a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n \tag{3} \]

On the other hand, the $j$th entry of $P[v]_{S'}$ is obtained by multiplying the $j$th row of $P$ by $[v]_{S'}$, that is, (1) by (2). However, the product of (1) and (2) is (3). Hence, $P[v]_{S'}$ and $[v]_S$ have the same entries. Thus, $P[v]_{S'} = [v]_S$, as claimed.

Furthermore, multiplying the above by $P^{-1}$ gives $P^{-1}[v]_S = P^{-1}P[v]_{S'} = [v]_{S'}$.

Linear Operators and Change of Basis

6.23. Consider the linear transformation $F$ on $\mathbf{R}^2$ defined by $F(x, y) = (5x - y,\ 2x + y)$ and the following bases of $\mathbf{R}^2$:

\[ E = \{e_1, e_2\} = \{(1,0), (0,1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,4), (2,7)\} \]

(a) Find the change-of-basis matrix $P$ from $E$ to $S$ and the change-of-basis matrix $Q$ from $S$ back to $E$.

(b) Find the matrix $A$ that represents $F$ in the basis $E$.

(c) Find the matrix $B$ that represents $F$ in the basis $S$.

(a) Because $E$ is the usual basis, simply write the vectors in $S$ as columns to obtain the change-of-basis matrix $P$. Recall, also, that $Q = P^{-1}$. Thus,

\[ P = \begin{bmatrix} 1 & 2 \\ 4 & 7 \end{bmatrix} \quad\text{and}\quad Q = P^{-1} = \begin{bmatrix} -7 & 2 \\ 4 & -1 \end{bmatrix} \]

(b) Write the coefficients of $x$ and $y$ in $F(x, y) = (5x - y,\ 2x + y)$ as rows to get

\[ A = \begin{bmatrix} 5 & -1 \\ 2 & 1 \end{bmatrix} \]

(c) Method 1. Find the coordinates of $F(u_1)$ and $F(u_2)$ relative to the basis $S$. This may be done by first finding the coordinates of an arbitrary vector $(a, b)$ in $\mathbf{R}^2$ relative to the basis $S$. We have

\[ (a, b) = x(1,4) + y(2,7) = (x + 2y,\ 4x + 7y), \quad\text{and so}\quad \begin{aligned} x + 2y &= a \\ 4x + 7y &= b \end{aligned} \]

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -7a + 2b$, $y = 4a - b$. Then

\[ (a, b) = (-7a + 2b)u_1 + (4a - b)u_2 \]

Now use the formula for $(a, b)$ to obtain

\[ \begin{aligned} F(u_1) &= F(1,4) = (1,6) = 5u_1 - 2u_2 \\ F(u_2) &= F(2,7) = (3,11) = u_1 + u_2 \end{aligned} \quad\text{and so}\quad B = \begin{bmatrix} 5 & 1 \\ -2 & 1 \end{bmatrix} \]

Method 2. By Theorem 6.7, $B = P^{-1}AP$. Thus,

\[ B = P^{-1}AP = \begin{bmatrix} -7 & 2 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 5 & -1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 4 & 7 \end{bmatrix} = \begin{bmatrix} 5 & 1 \\ -2 & 1 \end{bmatrix} \]
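Method 2 of Problem 6.23 is a direct matrix computation, so it is easy to reproduce. The following is a minimal sketch with exact fractions; the helper names `matmul` and `inverse_2x2` are illustrative assumptions rather than anything defined in the text.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[5, -1], [2, 1]]    # F(x, y) = (5x - y, 2x + y) in the usual basis E
P = [[1, 2], [4, 7]]     # columns are u1 = (1, 4) and u2 = (2, 7)
Q = inverse_2x2(P)       # change of basis from S back to E
B = matmul(Q, matmul(A, P))   # matrix of F in the basis S
```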


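6.24. Let $A = \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis $S = \{u_1, u_2\} = \{[1,3]^T, [2,5]^T\}$. [Recall $A$ defines a linear operator $A: \mathbf{R}^2 \to \mathbf{R}^2$ relative to the usual basis $E$ of $\mathbf{R}^2$.]

Method 1. Find the coordinates of $A(u_1)$ and $A(u_2)$ relative to the basis $S$ by first finding the coordinates of an arbitrary vector $[a, b]^T$ in $\mathbf{R}^2$ relative to the basis $S$. By Problem 6.2,

\[ [a, b]^T = (-5a + 2b)u_1 + (3a - b)u_2 \]

Using the formula for $[a, b]^T$, we obtain

\[ A(u_1) = \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 11 \\ 1 \end{bmatrix} = -53u_1 + 32u_2 \quad\text{and}\quad A(u_2) = \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} 19 \\ 3 \end{bmatrix} = -89u_1 + 54u_2 \]

Thus,

\[ B = \begin{bmatrix} -53 & -89 \\ 32 & 54 \end{bmatrix} \]

Method 2. Use $B = P^{-1}AP$, where $P$ is the change-of-basis matrix from the usual basis $E$ to $S$. Thus, simply write the vectors in $S$ (as columns) to obtain the change-of-basis matrix $P$ and then use the formula for $P^{-1}$. This gives

\[ P = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \quad\text{and}\quad P^{-1} = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix} \]

Then

\[ B = P^{-1}AP = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}\begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} -53 & -89 \\ 32 & 54 \end{bmatrix} \]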


6.25. Let $A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 5 & -4 \\ 1 & -2 & 2 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis

\[ S = \{u_1, u_2, u_3\} = \{[1,1,0]^T,\ [0,1,1]^T,\ [1,2,2]^T\} \]

[Recall that $A$ defines a linear operator $A: \mathbf{R}^3 \to \mathbf{R}^3$ relative to the usual basis $E$ of $\mathbf{R}^3$.]

Method 1. Find the coordinates of $A(u_1)$, $A(u_2)$, $A(u_3)$ relative to the basis $S$ by first finding the coordinates of an arbitrary vector $v = (a, b, c)$ in $\mathbf{R}^3$ relative to the basis $S$. By Problem 6.16,

\[ v = (b - c)u_1 + (-2a + 2b - c)u_2 + (a - b + c)u_3 \]

Using this formula for $[a, b, c]^T$, we obtain

\[ \begin{aligned} A(u_1) &= [4, 7, -1]^T = 8u_1 + 7u_2 - 4u_3 \\ A(u_2) &= [4, 1, 0]^T = u_1 - 6u_2 + 3u_3 \\ A(u_3) &= [9, 4, 1]^T = 3u_1 - 11u_2 + 6u_3 \end{aligned} \]

Writing the coefficients of $u_1, u_2, u_3$ as columns yields

\[ B = \begin{bmatrix} 8 & 1 & 3 \\ 7 & -6 & -11 \\ -4 & 3 & 6 \end{bmatrix} \]

Method 2. Use $B = P^{-1}AP$, where $P$ is the change-of-basis matrix from the usual basis $E$ to $S$. The matrix $P$ (whose columns are simply the vectors in $S$) and $P^{-1}$ appear in Problem 6.16. Thus,

\[ B = P^{-1}AP = \begin{bmatrix} 0 & 1 & -1 \\ -2 & 2 & -1 \\ 1 & -1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 3 & 1 \\ 2 & 5 & -4 \\ 1 & -2 & 2 \end{bmatrix}\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 2 \\ 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 8 & 1 & 3 \\ 7 & -6 & -11 \\ -4 & 3 & 6 \end{bmatrix} \]


6.26. Prove Theorem 6.7: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any linear operator $T$ on $V$, $[T]_{S'} = P^{-1}[T]_SP$.

Let $v$ be a vector in $V$. Then, by Theorem 6.6, $P[v]_{S'} = [v]_S$. Therefore,

\[ P^{-1}[T]_SP[v]_{S'} = P^{-1}[T]_S[v]_S = P^{-1}[T(v)]_S = [T(v)]_{S'} \]

But $[T]_{S'}[v]_{S'} = [T(v)]_{S'}$. Hence,

\[ P^{-1}[T]_SP[v]_{S'} = [T]_{S'}[v]_{S'} \]

Because the mapping $v \mapsto [v]_{S'}$ is onto $K^n$, we have $P^{-1}[T]_SPX = [T]_{S'}X$ for every $X \in K^n$. Thus, $P^{-1}[T]_SP = [T]_{S'}$, as claimed.

Similarity of Matrices 


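6.27. Let $A = \begin{bmatrix} 4 & -2 \\ 3 & 6 \end{bmatrix}$ and $P = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$.

(a) Find $B = P^{-1}AP$. (b) Verify $\operatorname{tr}(B) = \operatorname{tr}(A)$. (c) Verify $\det(B) = \det(A)$.

(a) First find $P^{-1}$ using the formula for the inverse of a $2 \times 2$ matrix. We have

\[ P^{-1} = \begin{bmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix} \]

Then

\[ B = P^{-1}AP = \begin{bmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix}\begin{bmatrix} 4 & -2 \\ 3 & 6 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 25 & 30 \\ -\tfrac{27}{2} & -15 \end{bmatrix} \]

(b) $\operatorname{tr}(A) = 4 + 6 = 10$ and $\operatorname{tr}(B) = 25 - 15 = 10$. Hence, $\operatorname{tr}(B) = \operatorname{tr}(A)$.

(c) $\det(A) = 24 + 6 = 30$ and $\det(B) = -375 + 405 = 30$. Hence, $\det(B) = \det(A)$.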
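The invariance of trace and determinant in Problem 6.27 can be confirmed with a few lines of exact arithmetic. This is only a sketch for the $2 \times 2$ case; the helper names (`matmul`, `inverse_2x2`, `trace`, `det_2x2`) are assumptions chosen for illustration.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))

def det_2x2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[4, -2], [3, 6]]
P = [[1, 2], [3, 4]]
B = matmul(inverse_2x2(P), matmul(A, P))   # B = P^-1 A P
```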

6.28. Find the trace of each of the linear transformations $F$ on $\mathbf{R}^3$ in Problem 6.4.

Find the trace (sum of the diagonal elements) of any matrix representation of $F$, such as the matrix representation $[F] = [F]_E$ of $F$ relative to the usual basis $E$ given in Problem 6.4.

(a) $\operatorname{tr}(F) = \operatorname{tr}([F]) = 1 - 5 + 9 = 5$.

(b) $\operatorname{tr}(F) = \operatorname{tr}([F]) = 1 + 3 + 5 = 9$.

(c) $\operatorname{tr}(F) = \operatorname{tr}([F]) = 1 + 4 + 7 = 12$.

6.29. Write $A \approx B$ if $A$ is similar to $B$, that is, if there exists an invertible matrix $P$ such that $A = P^{-1}BP$. Prove that $\approx$ is an equivalence relation (on square matrices); that is,

(a) $A \approx A$, for every $A$. (b) If $A \approx B$, then $B \approx A$.

(c) If $A \approx B$ and $B \approx C$, then $A \approx C$.

(a) The identity matrix $I$ is invertible, and $I^{-1} = I$. Because $A = I^{-1}AI$, we have $A \approx A$.

(b) Because $A \approx B$, there exists an invertible matrix $P$ such that $A = P^{-1}BP$. Hence, $B = PAP^{-1} = (P^{-1})^{-1}AP^{-1}$ and $P^{-1}$ is also invertible. Thus, $B \approx A$.

(c) Because $A \approx B$, there exists an invertible matrix $P$ such that $A = P^{-1}BP$, and as $B \approx C$, there exists an invertible matrix $Q$ such that $B = Q^{-1}CQ$. Thus,

\[ A = P^{-1}BP = P^{-1}(Q^{-1}CQ)P = (P^{-1}Q^{-1})C(QP) = (QP)^{-1}C(QP) \]

and $QP$ is also invertible. Thus, $A \approx C$.

6.30. Suppose $B$ is similar to $A$, say $B = P^{-1}AP$. Prove

(a) $B^n = P^{-1}A^nP$, and so $B^n$ is similar to $A^n$.

(b) $f(B) = P^{-1}f(A)P$, for any polynomial $f(x)$, and so $f(B)$ is similar to $f(A)$.

(c) $B$ is a root of a polynomial $g(x)$ if and only if $A$ is a root of $g(x)$.

(a) The proof is by induction on $n$. The result holds for $n = 1$ by hypothesis. Suppose $n > 1$ and the result holds for $n - 1$. Then

\[ B^n = BB^{n-1} = (P^{-1}AP)(P^{-1}A^{n-1}P) = P^{-1}A^nP \]

(b) Suppose $f(x) = a_nx^n + \cdots + a_1x + a_0$. Using the left and right distributive laws and part (a), we have

\[ \begin{aligned} P^{-1}f(A)P &= P^{-1}(a_nA^n + \cdots + a_1A + a_0I)P \\ &= P^{-1}(a_nA^n)P + \cdots + P^{-1}(a_1A)P + P^{-1}(a_0I)P \\ &= a_n(P^{-1}A^nP) + \cdots + a_1(P^{-1}AP) + a_0(P^{-1}IP) \\ &= a_nB^n + \cdots + a_1B + a_0I = f(B) \end{aligned} \]

(c) By part (b), $g(B) = 0$ if and only if $P^{-1}g(A)P = 0$ if and only if $g(A) = P0P^{-1} = 0$.
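Parts (b) and (c) of Problem 6.30 can be illustrated numerically. The sketch below borrows the matrices $A$ and $P$ of Problem 6.27 as sample data, which is an assumption made only for illustration, and evaluates polynomials at a matrix by Horner's rule. The polynomial $g(x) = x^2 - 10x + 30$ is built from the trace and determinant of $A$, so $A$ is a root of it by the Cayley-Hamilton theorem; the helper `poly_at` is hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def poly_at(coeffs, m):
    """Evaluate a polynomial at a square matrix by Horner's rule.

    coeffs lists a_n, ..., a_1, a_0 (highest degree first)."""
    n = len(m)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, m)   # multiply the running value by m
        for i in range(n):
            result[i][i] += c        # add c times the identity matrix
    return result

A = [[4, -2], [3, 6]]
P = [[1, 2], [3, 4]]
P_inv = inverse_2x2(P)
B = matmul(P_inv, matmul(A, P))

f = [2, -3, 5]                       # f(x) = 2x^2 - 3x + 5, an arbitrary choice
f_B = poly_at(f, B)
conjugated = matmul(P_inv, matmul(poly_at(f, A), P))   # P^-1 f(A) P

g = [1, -10, 30]                     # g(x) = x^2 - tr(A) x + det(A)
zero = [[0, 0], [0, 0]]
```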

Matrix Representations of General Linear Mappings 

6.31. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ be the linear map defined by $F(x, y, z) = (3x + 2y - 4z,\ x - 5y + 3z)$.

(a) Find the matrix of $F$ in the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:

\[ S = \{w_1, w_2, w_3\} = \{(1,1,1), (1,1,0), (1,0,0)\} \quad\text{and}\quad S' = \{u_1, u_2\} = \{(1,3), (2,5)\} \]

(b) Verify Theorem 6.10: The action of $F$ is preserved by its matrix representation; that is, for any $v$ in $\mathbf{R}^3$, we have $[F]_{S,S'}[v]_S = [F(v)]_{S'}$.

(a) From Problem 6.2, $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$. Thus,

\[ \begin{aligned} F(w_1) &= F(1,1,1) = (1,-1) = -7u_1 + 4u_2 \\ F(w_2) &= F(1,1,0) = (5,-4) = -33u_1 + 19u_2 \\ F(w_3) &= F(1,0,0) = (3,1) = -13u_1 + 8u_2 \end{aligned} \]

Write the coordinates of $F(w_1)$, $F(w_2)$, $F(w_3)$ as columns to get

\[ [F]_{S,S'} = \begin{bmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{bmatrix} \]

(b) If $v = (x, y, z)$, then, by Problem 6.5, $v = zw_1 + (y - z)w_2 + (x - y)w_3$. Also,

\[ F(v) = (3x + 2y - 4z,\ x - 5y + 3z) = (-13x - 20y + 26z)u_1 + (8x + 11y - 15z)u_2 \]

Hence,

\[ [v]_S = (z,\ y - z,\ x - y)^T \quad\text{and}\quad [F(v)]_{S'} = \begin{bmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{bmatrix} \]

Thus,

\[ [F]_{S,S'}[v]_S = \begin{bmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{bmatrix}\begin{bmatrix} z \\ y - z \\ x - y \end{bmatrix} = \begin{bmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{bmatrix} = [F(v)]_{S'} \]
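The construction in Problem 6.31 (apply $F$ to each vector of $S$, take coordinates relative to $S'$, and write them as columns) can be sketched as follows. The function names are illustrative assumptions; `coords_S2` hard-codes the coordinate formula quoted from Problem 6.2, `coords_S` hard-codes the formula quoted from Problem 6.5, and the test vector $(2, -1, 3)$ is an arbitrary choice.

```python
def F(x, y, z):
    """The linear map of Problem 6.31."""
    return (3 * x + 2 * y - 4 * z, x - 5 * y + 3 * z)

def coords_S2(a, b):
    """Coordinates of (a, b) relative to S' = {(1, 3), (2, 5)} (Problem 6.2)."""
    return (-5 * a + 2 * b, 3 * a - b)

def coords_S(x, y, z):
    """Coordinates of (x, y, z) relative to S (Problem 6.5)."""
    return (z, y - z, x - y)

S = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]

# Each column of the matrix is [F(w_j)]_{S'}; zip(*...) turns columns into rows.
columns = [coords_S2(*F(*w)) for w in S]
matrix = [list(row) for row in zip(*columns)]

# Check Theorem 6.10 on one sample vector.
v = (2, -1, 3)
lhs = [sum(m * c for m, c in zip(row, coords_S(*v))) for row in matrix]
rhs = list(coords_S2(*F(*v)))
```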


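6.32. Let $F: \mathbf{R}^n \to \mathbf{R}^m$ be the linear mapping defined as follows:

\[ F(x_1, x_2, \ldots, x_n) = (a_{11}x_1 + \cdots + a_{1n}x_n,\ a_{21}x_1 + \cdots + a_{2n}x_n,\ \ldots,\ a_{m1}x_1 + \cdots + a_{mn}x_n) \]

(a) Show that the rows of the matrix $[F]$ representing $F$ relative to the usual bases of $\mathbf{R}^n$ and $\mathbf{R}^m$ are the coefficients of the $x_i$ in the components of $F(x_1, \ldots, x_n)$.

(b) Find the matrix representation of each of the following linear mappings relative to the usual basis of $\mathbf{R}^n$:

(i) $F: \mathbf{R}^2 \to \mathbf{R}^3$ defined by $F(x, y) = (3x - y,\ 2x + 4y,\ 5x - 6y)$.

(ii) $F: \mathbf{R}^4 \to \mathbf{R}^2$ defined by $F(x, y, s, t) = (3x - 4y + 2s - 5t,\ 5x + 7y - s - 2t)$.

(iii) $F: \mathbf{R}^3 \to \mathbf{R}^4$ defined by $F(x, y, z) = (2x + 3y - 8z,\ x + y + z,\ 4x - 5z,\ 6y)$.

(a) We have

\[ \begin{aligned} F(1, 0, \ldots, 0) &= (a_{11}, a_{21}, \ldots, a_{m1}) \\ F(0, 1, \ldots, 0) &= (a_{12}, a_{22}, \ldots, a_{m2}) \\ &\ \,\vdots \\ F(0, 0, \ldots, 1) &= (a_{1n}, a_{2n}, \ldots, a_{mn}) \end{aligned} \quad\text{and thus,}\quad [F] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]

(b) By part (a), we need only look at the coefficients of the unknowns $x, y, \ldots$ in $F(x, y, \ldots)$. Thus,

\[ \text{(i) } [F] = \begin{bmatrix} 3 & -1 \\ 2 & 4 \\ 5 & -6 \end{bmatrix}, \quad \text{(ii) } [F] = \begin{bmatrix} 3 & -4 & 2 & -5 \\ 5 & 7 & -1 & -2 \end{bmatrix}, \quad \text{(iii) } [F] = \begin{bmatrix} 2 & 3 & -8 \\ 1 & 1 & 1 \\ 4 & 0 & -5 \\ 0 & 6 & 0 \end{bmatrix} \]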


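6.33. Let $A = \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}$. Recall that $A$ determines a mapping $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(v) = Av$, where vectors are written as columns. Find the matrix $[F]$ that represents the mapping relative to the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:

(a) The usual bases of $\mathbf{R}^3$ and of $\mathbf{R}^2$.

(b) $S = \{w_1, w_2, w_3\} = \{(1,1,1), (1,1,0), (1,0,0)\}$ and $S' = \{u_1, u_2\} = \{(1,3), (2,5)\}$.

(a) Relative to the usual bases, $[F]$ is the matrix $A$.

(b) From Problem 6.2, $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$. Thus,

\[ \begin{aligned} F(w_1) &= \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = -12u_1 + 8u_2 \\ F(w_2) &= \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 7 \\ -3 \end{bmatrix} = -41u_1 + 24u_2 \\ F(w_3) &= \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} = -8u_1 + 5u_2 \end{aligned} \]

Writing the coefficients of $F(w_1)$, $F(w_2)$, $F(w_3)$ as columns yields

\[ [F] = \begin{bmatrix} -12 & -41 & -8 \\ 8 & 24 & 5 \end{bmatrix} \]

6.34. Consider the linear transformation $T$ on $\mathbf{R}^2$ defined by $T(x, y) = (2x - 3y,\ x + 4y)$ and the following bases of $\mathbf{R}^2$:

\[ E = \{e_1, e_2\} = \{(1,0), (0,1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,3), (2,5)\} \]

(a) Find the matrix $A$ representing $T$ relative to the bases $E$ and $S$.

(b) Find the matrix $B$ representing $T$ relative to the bases $S$ and $E$.

(We can view $T$ as a linear mapping from one space into another, each having its own basis.)

(a) From Problem 6.2, $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$. Hence,

\[ \begin{aligned} T(e_1) &= T(1,0) = (2,1) = -8u_1 + 5u_2 \\ T(e_2) &= T(0,1) = (-3,4) = 23u_1 - 13u_2 \end{aligned} \quad\text{and so}\quad A = \begin{bmatrix} -8 & 23 \\ 5 & -13 \end{bmatrix} \]

(b) We have

\[ \begin{aligned} T(u_1) &= T(1,3) = (-7,13) = -7e_1 + 13e_2 \\ T(u_2) &= T(2,5) = (-11,22) = -11e_1 + 22e_2 \end{aligned} \quad\text{and so}\quad B = \begin{bmatrix} -7 & -11 \\ 13 & 22 \end{bmatrix} \]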

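6.35. How are the matrices $A$ and $B$ in Problem 6.34 related?

By Theorem 6.12, the matrices $A$ and $B$ are equivalent to each other; that is, there exist nonsingular matrices $P$ and $Q$ such that $B = Q^{-1}AP$, where $P$ is the change-of-basis matrix from $E$ to $S$, and $Q$ is the change-of-basis matrix from $S$ to $E$. Thus,

\[ P = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}, \quad Q = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}, \quad Q^{-1} = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \]

and

\[ Q^{-1}AP = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}\begin{bmatrix} -8 & 23 \\ 5 & -13 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} -7 & -11 \\ 13 & 22 \end{bmatrix} = B \]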
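The relation $B = Q^{-1}AP$ between the two matrices of Problem 6.34 can be checked directly. In this sketch, $A$ and $B$ are rebuilt from the definition of $T$, and both $P$ and $Q^{-1}$ are taken to be the matrix whose columns are $u_1$ and $u_2$. The helper names are illustrative assumptions, and `coords_in_S` hard-codes the coordinate formula quoted from Problem 6.2.

```python
def T(x, y):
    """The linear transformation of Problem 6.34."""
    return (2 * x - 3 * y, x + 4 * y)

def coords_in_S(a, b):
    """Coordinates of (a, b) relative to S = {(1, 3), (2, 5)} (Problem 6.2)."""
    return (-5 * a + 2 * b, 3 * a - b)

def columns_to_matrix(cols):
    """Build a matrix (list of rows) from a list of column vectors."""
    return [list(row) for row in zip(*cols)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

E = [(1, 0), (0, 1)]
S = [(1, 3), (2, 5)]

A = columns_to_matrix([coords_in_S(*T(*e)) for e in E])  # domain basis E, target basis S
B = columns_to_matrix([T(*u) for u in S])                # domain basis S, target basis E
P = columns_to_matrix(S)       # columns u1, u2
Q_inv = columns_to_matrix(S)   # same matrix, playing the role of Q^-1

product = matmul(Q_inv, matmul(A, P))
```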


6.36. Prove Theorem 6.14: Let $F: V \to U$ be linear and, say, $\operatorname{rank}(F) = r$. Then there exist bases of $V$ and of $U$ such that the matrix representation of $F$ has the following form, where $I_r$ is the $r$-square identity matrix:

\[ A = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \]

Suppose $\dim V = m$ and $\dim U = n$. Let $W$ be the kernel of $F$ and $U'$ the image of $F$. We are given that $\operatorname{rank}(F) = r$. Hence, the dimension of the kernel of $F$ is $m - r$. Let $\{w_1, \ldots, w_{m-r}\}$ be a basis of the kernel of $F$ and extend this to a basis of $V$:

\[ \{v_1, \ldots, v_r, w_1, \ldots, w_{m-r}\} \]

Set

\[ u_1 = F(v_1),\ u_2 = F(v_2),\ \ldots,\ u_r = F(v_r) \]

Then $\{u_1, \ldots, u_r\}$ is a basis of $U'$, the image of $F$. Extend this to a basis of $U$, say

\[ \{u_1, \ldots, u_r, u_{r+1}, \ldots, u_n\} \]

Observe that

\[ \begin{aligned} F(v_1) &= u_1 = 1u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\ F(v_2) &= u_2 = 0u_1 + 1u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\ &\ \,\vdots \\ F(v_r) &= u_r = 0u_1 + 0u_2 + \cdots + 1u_r + 0u_{r+1} + \cdots + 0u_n \\ F(w_1) &= 0 = 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\ &\ \,\vdots \\ F(w_{m-r}) &= 0 = 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \end{aligned} \]

Thus, the matrix of $F$ in the above bases has the required form.


SUPPLEMENTARY PROBLEMS 


Matrices and Linear Operators 

6.37. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (4x + 5y,\ 2x - y)$.

(a) Find the matrix $A$ representing $F$ in the usual basis $E$.

(b) Find the matrix $B$ representing $F$ in the basis $S = \{u_1, u_2\} = \{(1,4), (2,9)\}$.

(c) Find $P$ such that $B = P^{-1}AP$.

(d) For $v = (a, b)$, find $[v]_S$ and $[F(v)]_S$. Verify that $[F]_S[v]_S = [F(v)]_S$.

t r 5 

6.38. Let A: R" —> R~ be defined by the matrix A= 1 . 

(a) Find the matrix $B$ representing $A$ relative to the basis $S = \{u_1, u_2\} = \{(1,3), (2,8)\}$. (Recall that $A$ represents the mapping $A$ relative to the usual basis $E$.)

(b) For $v = (a, b)$, find $[v]_S$ and $[A(v)]_S$.

6.39. For each linear transformation $L$ on $\mathbf{R}^2$, find the matrix $A$ representing $L$ (relative to the usual basis of $\mathbf{R}^2$):

(a) $L$ is the rotation in $\mathbf{R}^2$ counterclockwise by $45°$.

(b) $L$ is the reflection in $\mathbf{R}^2$ about the line $y = x$.

(c) $L$ is defined by $L(1,0) = (3,5)$ and $L(0,1) = (7,-2)$.

(d) $L$ is defined by $L(1,1) = (3,7)$ and $L(1,2) = (5,-4)$.

6.40. Find the matrix representing each linear transformation $T$ on $\mathbf{R}^3$ relative to the usual basis of $\mathbf{R}^3$:

(a) $T(x, y, z) = (x, y, 0)$. (b) $T(x, y, z) = (z,\ y + z,\ x + y + z)$.

(c) $T(x, y, z) = (2x - 7y - 4z,\ 3x + y + 4z,\ 6x - 8y + z)$.

6.41. Repeat Problem 6.40 using the basis $S = \{u_1, u_2, u_3\} = \{(1,1,0), (1,2,3), (1,3,5)\}$.

6.42. Let $L$ be the linear transformation on $\mathbf{R}^3$ defined by

\[ L(1,0,0) = (1,1,1), \quad L(0,1,0) = (1,3,5), \quad L(0,0,1) = (2,2,2) \]

(a) Find the matrix $A$ representing $L$ relative to the usual basis of $\mathbf{R}^3$.

(b) Find the matrix $B$ representing $L$ relative to the basis $S$ in Problem 6.41.

6.43. Let $D$ denote the differential operator; that is, $D(f(t)) = df/dt$. Each of the following sets is a basis of a vector space $V$ of functions. Find the matrix representing $D$ in each basis:

(a) $\{e^t, e^{2t}, te^{2t}\}$. (b) $\{1, t, \sin 3t, \cos 3t\}$. (c) $\{e^{5t}, te^{5t}, t^2e^{5t}\}$.

6.44. Let $D$ denote the differential operator on the vector space $V$ of functions with basis $S = \{\sin\theta, \cos\theta\}$.

(a) Find the matrix $A = [D]_S$. (b) Use $A$ to show that $D$ is a zero of $f(t) = t^2 + 1$.

6.45. Let $V$ be the vector space of $2 \times 2$ matrices. Consider the following matrix $M$ and usual basis $E$ of $V$:

\[ M = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad E = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\} \]

Find the matrix representing each of the following linear operators $T$ on $V$ relative to $E$:

(a) $T(A) = MA$. (b) $T(A) = AM$. (c) $T(A) = MA - AM$.

6.46. Let $\mathbf{1}_V$ and $\mathbf{0}_V$ denote the identity and zero operators, respectively, on a vector space $V$. Show that, for any basis $S$ of $V$, (a) $[\mathbf{1}_V]_S = I$, the identity matrix. (b) $[\mathbf{0}_V]_S = 0$, the zero matrix.


Change of Basis 

6.47. Find the change-of-basis matrix $P$ from the usual basis $E$ of $\mathbf{R}^2$ to a basis $S$, the change-of-basis matrix $Q$ from $S$ back to $E$, and the coordinates of $v = (a, b)$ relative to $S$, for the following bases $S$:

(a) $S = \{(1,2), (3,5)\}$. (c) $S = \{(2,5), (3,7)\}$.

(b) $S = \{(1,-3), (3,-8)\}$. (d) $S = \{(2,3), (4,5)\}$.

6.48. Consider the bases $S = \{(1,2), (2,3)\}$ and $S' = \{(1,3), (1,4)\}$ of $\mathbf{R}^2$. Find the change-of-basis matrix:

(a) $P$ from $S$ to $S'$. (b) $Q$ from $S'$ back to $S$.

6.49. Suppose that the $x$-axis and $y$-axis in the plane $\mathbf{R}^2$ are rotated counterclockwise $30°$ to yield new $x'$-axis and $y'$-axis for the plane. Find

(a) The unit vectors in the direction of the new $x'$-axis and $y'$-axis.

(b) The change-of-basis matrix $P$ for the new coordinate system.

(c) The new coordinates of the points $A(1,3)$, $B(2,-5)$, $C(a,b)$.

6.50. Find the change-of-basis matrix $P$ from the usual basis $E$ of $\mathbf{R}^3$ to a basis $S$, the change-of-basis matrix $Q$ from $S$ back to $E$, and the coordinates of $v = (a, b, c)$ relative to $S$, where $S$ consists of the vectors:

(a) $u_1 = (1,1,0)$, $u_2 = (0,1,2)$, $u_3 = (0,1,1)$.

(b) $u_1 = (1,0,1)$, $u_2 = (1,1,2)$, $u_3 = (1,2,4)$.

(c) $u_1 = (1,2,1)$, $u_2 = (1,3,4)$, $u_3 = (2,5,6)$.

6.51. Suppose $S_1, S_2, S_3$ are bases of $V$. Let $P$ and $Q$ be the change-of-basis matrices, respectively, from $S_1$ to $S_2$ and from $S_2$ to $S_3$. Prove that $PQ$ is the change-of-basis matrix from $S_1$ to $S_3$.


Linear Operators and Change of Basis 

6.52. Consider the linear operator $F$ on $\mathbf{R}^2$ defined by $F(x, y) = (5x + y,\ 3x - 2y)$ and the following bases of $\mathbf{R}^2$:

\[ S = \{(1,2), (2,3)\} \quad\text{and}\quad S' = \{(1,3), (1,4)\} \]

(a) Find the matrix A representing F relative to the basis S. 

(b) Find the matrix B representing F relative to the basis S'. 

(c) Find the change-of-basis matrix P from S to S'. 

(d) How are A and B related? 


6.53. 


Let A: R 2 —> R 2 be defined by the matrix A = 
A relative to each of the following bases: (a) 


^ n J. Find the matrix B that represents the linear operator 

$S = \{(1,3)^T, (2,5)^T\}$. (b) $S = \{(1,3)^T, (2,4)^T\}$.

6.54. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (x - 3y,\ 2x - 4y)$. Find the matrix $A$ that represents $F$ relative to each of the following bases: (a) $S = \{(2,5), (3,7)\}$. (b) $S = \{(2,3), (4,5)\}$.

6.55. Let $A: \mathbf{R}^3 \to \mathbf{R}^3$ be defined by the matrix $A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 7 & 4 \\ 1 & 4 & 3 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis $S = \{(1,1,1)^T, (0,1,1)^T, (1,2,3)^T\}$.


Similarity of Matrices 


6.56. Let A = 

(a) Find $B = P^{-1}AP$. (b) Verify that $\operatorname{tr}(B) = \operatorname{tr}(A)$. (c) Verify that $\det(B) = \det(A)$.

6.57. Find the trace and determinant of each of the following linear maps on $\mathbf{R}^2$:

(a) $F(x, y) = (2x - 3y,\ 5x + 4y)$. (b) $G(x, y) = (ax + by,\ cx + dy)$.

6.58. Find the trace and determinant of each of the following linear maps on $\mathbf{R}^3$:

(a) $F(x, y, z) = (x + 3y,\ 3x - 2z,\ x - 4y - 3z)$.

(b) $G(x, y, z) = (y + 3z,\ 2x - 4z,\ 5x + 7y)$.


and P = 


1 -2 
3 -5 


6.59. Suppose $S = \{u_1, u_2\}$ is a basis of V, and $T: V \to V$ is defined by $T(u_1) = 3u_1 - 2u_2$ and $T(u_2) = u_1 + 4u_2$.
Suppose $S' = \{w_1, w_2\}$ is a basis of V for which $w_1 = u_1 + u_2$ and $w_2 = 2u_1 + 3u_2$.

(a) Find the matrices A and B representing T relative to the bases S and S', respectively.

(b) Find the matrix P such that $B = P^{-1}AP$.


6.60. Let A be a $2 \times 2$ matrix such that only A is similar to itself. Show that A is a scalar matrix, that is, that
$A = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix}$.

6.61. Show that all matrices similar to an invertible matrix are invertible. More generally, show that similar matrices 
have the same rank. 


Matrix Representation of General Linear Mappings 

6.62. Find the matrix representation of each of the following linear maps relative to the usual basis for $\mathbf{R}^n$:

(a) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (2x - 4y + 9z, 5x + 3y - 2z)$.

(b) $F: \mathbf{R}^2 \to \mathbf{R}^4$ defined by $F(x, y) = (3x + 4y, 5x - 2y, x + 7y, 4x)$.

(c) $F: \mathbf{R}^4 \to \mathbf{R}$ defined by $F(x_1, x_2, x_3, x_4) = 2x_1 + x_2 - 7x_3 - x_4$.

6.63. Let $G: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $G(x, y, z) = (2x + 3y - z, 4x - y + 2z)$.

(a) Find the matrix A representing G relative to the bases

$S = \{(1,1,0), (1,2,3), (1,3,5)\}$ and $S' = \{(1,2), (2,3)\}$

(b) For any $v = (a, b, c)$ in $\mathbf{R}^3$, find $[v]_S$ and $[G(v)]_{S'}$. (c) Verify that $A[v]_S = [G(v)]_{S'}$.

6.64. Let $H: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $H(x, y) = (2x + 7y, x - 3y)$ and consider the following bases of $\mathbf{R}^2$:

$S = \{(1,1), (1,2)\}$ and $S' = \{(1,4), (1,5)\}$

(a) Find the matrix A representing H relative to the bases S and S'.

(b) Find the matrix B representing H relative to the bases S' and S.

6.65. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $F(x, y, z) = (2x + y - z, 3x - 2y + 4z)$.

(a) Find the matrix A representing F relative to the bases

$S = \{(1,1,1), (1,1,0), (1,0,0)\}$ and $S' = \{(1,3), (1,4)\}$

(b) Verify that, for any $v = (a, b, c)$ in $\mathbf{R}^3$, $A[v]_S = [F(v)]_{S'}$.

6.66. Let S and S' be bases of V, and let $1_V$ be the identity mapping on V. Show that the matrix A representing $1_V$
relative to the bases S and S' is the inverse of the change-of-basis matrix P from S to S'; that is, $A = P^{-1}$.

6.67. Prove (a) Theorem 6.10, (b) Theorem 6.11, (c) Theorem 6.12, (d) Theorem 6.13. [Hint: See the proofs 
of the analogous Theorems 6.1 (Problem 6.9), 6.2 (Problem 6.10), 6.3 (Problem 6.11), and 6.7 
(Problem 6.26).] 

Miscellaneous Problems 

6.68. Suppose $F: V \to V$ is linear. A subspace W of V is said to be invariant under F if $F(W) \subseteq W$. Suppose W is
invariant under F and dim W = r. Show that F has a block triangular matrix representation $M = \begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$,
where A is an $r \times r$ submatrix.

6.69. Suppose $V = U \oplus W$, and suppose U and W are each invariant under a linear operator $F: V \to V$. Also,
suppose dim U = r and dim W = s. Show that F has a block diagonal matrix representation $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$,
where A and B are $r \times r$ and $s \times s$ submatrices.

6.70. Two linear operators F and G on V are said to be similar if there exists an invertible linear operator T on V
such that $G = T^{-1} \circ F \circ T$. Prove

(a) F and G are similar if and only if, for any basis S of V, $[F]_S$ and $[G]_S$ are similar matrices.

(b) If F is diagonalizable (similar to a diagonal matrix), then any similar matrix G is also diagonalizable.




ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $M = [R_1;\ R_2;\ \ldots]$ represents a matrix M with rows $R_1, R_2, \ldots$.

6.37. (a) A =[4,5; 2,-1]; (b) B= [220,487; -98,-217]; (c) P=[ 1,2; 4,9]; 

(d) $[v]_S = [9a - 2b, -4a + b]^T$ and $[F(v)]_S = [32a + 47b, -14a - 21b]^T$

6.38. (a) B =[-6,-28; 4,15]; 

(b) [tt] s = [4a — b, — \a + }£>] r and [A(v)] s = [18a — 8b, }(—13a + 7Z>)] 

6.39. (a) $[\sqrt{2}, -\sqrt{2};\ \sqrt{2}, \sqrt{2}]$; (b) [0,1; 1,0]; (c) [3,7; 5,-2];

(d) [1,2; 18,-11]

6.40. (a) [1,0,0; 0,1,0; 0,0,0]; (b) [0,0,1; 0,1,1; 1,1,1];

(c) [2,-7,-4; 3,1,4; 6,-8,1]

6.41. (a) [1,3,5; 0,-5,-10; 0,3,6]; (b) [0,1,2; -1,2,3; 1,0,0];

(c) [15,65,104; -49,-219,-351; 29,130,208]

6.42. (a) [1,1,2; 1,3,2; 1,5,2]; (b) [0; 2,14,22; 0,-5,-8] 


6.43. (a) [1,0,0; 0,2,1; 0,0,2]; (b) [0.1,0,0; 0; 0,0,0,-3; 0,0,3,0]; 

(c) [5,1,0; 0,5,2; 0,0,5]

6.44. (a) A = [0,-1; 1,0]; (b) $A^2 + I = 0$

6.45. (a) [a,0,b,0; 0,a,0,b; c,0,d,0; 0,c,0,d];

(b) [a,c,0,0; b,d,0,0; 0,0,a,c; 0,0,b,d];

(c) [0,-c,b,0; -b,a-d,0,b; c,0,d-a,-c; 0,c,-b,0]

6.47. (a) [1,3; 2,5], [-5,3; 2,-1], $[v] = [-5a + 3b, 2a - b]^T$;

(b) [1,3; -3,-8], [-8,-3; 3,1], $[v] = [-8a - 3b, 3a + b]^T$;

(c) [2,3; 5,7], [-7,3; 5,-2], $[v] = [-7a + 3b, 5a - 2b]^T$;

(d) [2,4; 3,5], $[-\tfrac{5}{2}, 2;\ \tfrac{3}{2}, -1]$, $[v] = [-\tfrac{5}{2}a + 2b, \tfrac{3}{2}a - b]^T$

6.48. (a) P = [3,5; -1,-2]; (b) Q = [2,5; -1,-3]




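6.49. Here $K = \sqrt{3}$.

(a) $\tfrac{1}{2}(K, 1)$, $\tfrac{1}{2}(-1, K)$;

(b) $P = \tfrac{1}{2}[K, -1;\ 1, K]$;

(c) $\tfrac{1}{2}[K + 3, 3K - 1]^T$, $\tfrac{1}{2}[2K - 5, -5K - 2]^T$, $\tfrac{1}{2}[aK + b, bK - a]^T$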

6.50. P is the matrix whose columns are $u_1, u_2, u_3$, $Q = P^{-1}$, $[v] = Q[a, b, c]^T$.

(a) Q = [1,0,0; 1,-1,1; -2,2,-1], $[v] = [a, a - b + c, -2a + 2b - c]^T$;

(b) Q = [0,-2,1; 2,3,-2; -1,-1,1], $[v] = [-2b + c, 2a + 3b - 2c, -a - b + c]^T$;

(c) Q = [-2,2,-1; -7,4,-1; 5,-3,1], $[v] = [-2a + 2b - c, -7a + 4b - c, 5a - 3b + c]^T$

6.52. (a) [-23,-39; 15,26]; (b) [35,41; -27,-32]; (c) [3,5; -1,-2]; (d) $B = P^{-1}AP$

6.53. (a) [28,47; -15,-25]; (b) $[13, 18;\ -\tfrac{15}{2}, -10]$

6.54. (a) [43,60; -33,-46]; (b) $\tfrac{1}{2}[3, 7;\ -5, -9]$



6.55. [10,8,20; 13,11,28; -5,-4,-10] 


6.56. (a) [-34,57; -19,32]; (b) tr(B) = tr(A) = -2; (c) det(B) = det(A) = -5

6.57. (a) tr(F) = 6, det(F) = 23; (b) tr(G) = a + d, det(G) = ad - bc

6.58. (a) tr(F) = -2, det(F) = 13; (b) tr(G) = 0, det(G) = 22

6.59. (a) A = [3,1; -2,4], B = [8,11; -2,-1]; (b) P = [1,2; 1,3]

6.62. (a) [2,-4,9; 5,3,-2]; (b) [3,4; 5,-2; 1,7; 4,0]; (c) [2,1,-7,-1]

6.63. (a) [-9,1,4; 7,2,1]; (b) $[v]_S = [-a + 2b - c, 5a - 5b + 2c, -3a + 3b - c]^T$, and

$[G(v)]_{S'} = [2a - 11b + 7c, 7b - 4c]^T$

6.64. (a) A =[47,85; -38,-69]; (b) B= [71,88; -41,-51] 

6.65. A = [3,11,5; -1,-8,-3] 



CHAPTER 7 



Inner Product Spaces, Orthogonality


7.1 Introduction 


The definition of a vector space V involves an arbitrary field K. Here we first restrict K to be the real field
R, in which case V is called a real vector space; in the last sections of this chapter, we extend our results to
the case where K is the complex field C, in which case V is called a complex vector space. Also, we adopt
the previous notation that

u, v, w are vectors in V
a, b, c, k are scalars in K

Furthermore, the vector spaces V in this chapter have finite dimension unless otherwise stated or implied. 

Recall that the concepts of “length” and “orthogonality” did not appear in the investigation of arbitrary
vector spaces V (although they did appear in Section 1.4 on the spaces $\mathbf{R}^n$ and $\mathbf{C}^n$). Here we place an
additional structure on a vector space V to obtain an inner product space, and in this context these concepts
are defined.


7.2 Inner Product Spaces 


We begin with a definition. 

DEFINITION: Let V be a real vector space. Suppose to each pair of vectors $u, v \in V$ there is assigned a
real number, denoted by $\langle u, v \rangle$. This function is called a (real) inner product on V if it
satisfies the following axioms:

[I_1] (Linear Property): $\langle au_1 + bu_2, v \rangle = a\langle u_1, v \rangle + b\langle u_2, v \rangle$.

[I_2] (Symmetric Property): $\langle u, v \rangle = \langle v, u \rangle$.

[I_3] (Positive Definite Property): $\langle u, u \rangle \ge 0$; and $\langle u, u \rangle = 0$ if and only if $u = 0$.

The vector space V with an inner product is called a (real) inner product space.


Axiom [I_1] states that an inner product function is linear in the first position. Using [I_1] and the
symmetry axiom [I_2], we obtain

$$\langle u, cv_1 + dv_2 \rangle = \langle cv_1 + dv_2, u \rangle = c\langle v_1, u \rangle + d\langle v_2, u \rangle = c\langle u, v_1 \rangle + d\langle u, v_2 \rangle$$

That is, the inner product function is also linear in its second position. Combining these two properties and
using induction yields the following general formula:

$$\Big\langle \sum_i a_i u_i,\ \sum_j b_j v_j \Big\rangle = \sum_i \sum_j a_i b_j \langle u_i, v_j \rangle$$

That is, an inner product of linear combinations of vectors is equal to a linear combination of the inner
products of the vectors.

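EXAMPLE 7.1 Let V be a real inner product space. Then, by linearity,

$$\langle 3u_1 - 4u_2,\ 2v_1 - 5v_2 + 6v_3 \rangle = 6\langle u_1, v_1 \rangle - 15\langle u_1, v_2 \rangle + 18\langle u_1, v_3 \rangle - 8\langle u_2, v_1 \rangle + 20\langle u_2, v_2 \rangle - 24\langle u_2, v_3 \rangle$$

$$\langle 2u - 5v,\ 4u + 6v \rangle = 8\langle u, u \rangle + 12\langle u, v \rangle - 20\langle v, u \rangle - 30\langle v, v \rangle = 8\langle u, u \rangle - 8\langle v, u \rangle - 30\langle v, v \rangle$$

Observe that in the last equation we have used the symmetry property that $\langle u, v \rangle = \langle v, u \rangle$.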

Remark: Axiom [I_1] by itself implies $\langle 0, 0 \rangle = \langle 0v, 0 \rangle = 0\langle v, 0 \rangle = 0$. Thus, [I_1], [I_2], [I_3] are
equivalent to [I_1], [I_2], and the following axiom:

[I_3'] If $u \ne 0$, then $\langle u, u \rangle$ is positive.

That is, a function satisfying [I_1], [I_2], [I_3'] is an inner product.

Norm of a Vector 

By the third axiom [I_3] of an inner product, $\langle u, u \rangle$ is nonnegative for any vector u. Thus, its nonnegative square
root exists. We use the notation

$$\|u\| = \sqrt{\langle u, u \rangle}$$

This nonnegative number is called the norm or length of u. The relation $\|u\|^2 = \langle u, u \rangle$ will be used
frequently.

Remark: If $\|u\| = 1$ or, equivalently, if $\langle u, u \rangle = 1$, then u is called a unit vector and it is said to be
normalized. Every nonzero vector v in V can be multiplied by the reciprocal of its length to obtain the unit
vector

$$\hat{v} = \frac{1}{\|v\|}\, v$$

which is a positive multiple of v. This process is called normalizing v. 
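The normalizing step can be sketched in a few lines of Python. This is only an illustration under an assumption: it takes the inner product to be the usual dot product on $\mathbf{R}^n$, and the helper names `inner`, `norm`, and `normalize` are made up for this sketch.

```python
import math

def inner(u, v):
    """Dot product of two equal-length vectors (assumed inner product)."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Length of u: the nonnegative square root of <u, u>."""
    return math.sqrt(inner(u, u))

def normalize(v):
    """Multiply v by the reciprocal of its length; v must be nonzero."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [a / length for a in v]

v = [4, -2, 2, 1]
v_hat = normalize(v)
print(norm(v))       # 5.0
print(norm(v_hat))   # 1.0 up to floating-point rounding
```

The zero vector is rejected explicitly, since the reciprocal of its length does not exist.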


7.3 Examples of Inner Product Spaces 


This section lists the main examples of inner product spaces used in this text. 

Euclidean n-Space $\mathbf{R}^n$

Consider the vector space $\mathbf{R}^n$. The dot product or scalar product in $\mathbf{R}^n$ is defined by

$$u \cdot v = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$$

where $u = (a_i)$ and $v = (b_i)$. This function defines an inner product on $\mathbf{R}^n$. The norm $\|u\|$ of the vector
$u = (a_i)$ in this space is as follows:

$$\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$$

On the other hand, by the Pythagorean theorem, the distance from the origin O in $\mathbf{R}^3$ to a point
$P(a, b, c)$ is given by $\sqrt{a^2 + b^2 + c^2}$. This is precisely the same as the above-defined norm of the
vector $v = (a, b, c)$ in $\mathbf{R}^3$. Because the Pythagorean theorem is a consequence of the axioms of
Euclidean geometry, the vector space $\mathbf{R}^n$ with the above inner product and norm is called Euclidean
n-space. Although there are many ways to define an inner product on $\mathbf{R}^n$, we shall assume this
inner product unless otherwise stated or implied. It is called the usual (or standard) inner product
on $\mathbf{R}^n$.


Remark: Frequently the vectors in $\mathbf{R}^n$ will be represented by column vectors, that is, by $n \times 1$
column matrices. In such a case, the formula

$$\langle u, v \rangle = u^T v$$

defines the usual inner product on $\mathbf{R}^n$.

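EXAMPLE 7.2 Let $u = (1, 3, -4, 2)$, $v = (4, -2, 2, 1)$, $w = (5, -1, -2, 6)$ in $\mathbf{R}^4$.

(a) Show $\langle 3u - 2v, w \rangle = 3\langle u, w \rangle - 2\langle v, w \rangle$.

By definition,

$$\langle u, w \rangle = 5 - 3 + 8 + 12 = 22 \quad \text{and} \quad \langle v, w \rangle = 20 + 2 - 4 + 6 = 24$$

Note that $3u - 2v = (-5, 13, -16, 4)$. Thus,

$$\langle 3u - 2v, w \rangle = -25 - 13 + 32 + 24 = 18$$

As expected, $3\langle u, w \rangle - 2\langle v, w \rangle = 3(22) - 2(24) = 18 = \langle 3u - 2v, w \rangle$.

(b) Normalize u and v.

By definition,

$$\|u\| = \sqrt{1 + 9 + 16 + 4} = \sqrt{30} \quad \text{and} \quad \|v\| = \sqrt{16 + 4 + 4 + 1} = 5$$

We normalize u and v to obtain the following unit vectors in the directions of u and v, respectively:

$$\hat{u} = \frac{1}{\|u\|}\, u = \left( \frac{1}{\sqrt{30}}, \frac{3}{\sqrt{30}}, \frac{-4}{\sqrt{30}}, \frac{2}{\sqrt{30}} \right) \quad \text{and} \quad \hat{v} = \frac{1}{\|v\|}\, v = \left( \frac{4}{5}, \frac{-2}{5}, \frac{2}{5}, \frac{1}{5} \right)$$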


Function Space C[a, b] and Polynomial Space P(t)

The notation C[a, b] is used to denote the vector space of all continuous functions on the closed interval
[a, b], that is, where $a \le t \le b$. The following defines an inner product on C[a, b], where f(t) and g(t) are
functions in C[a, b]:

$$\langle f, g \rangle = \int_a^b f(t)\, g(t)\, dt$$

It is called the usual inner product on C[a, b].

The vector space P(t) of all polynomials is a subspace of C[a, b] for any interval [a, b], and hence, the
above is also an inner product on P(t).


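EXAMPLE 7.3 Consider $f(t) = 3t - 5$ and $g(t) = t^2$ in the polynomial space P(t) with inner product

$$\langle f, g \rangle = \int_0^1 f(t)\, g(t)\, dt$$

(a) Find $\langle f, g \rangle$.

We have $f(t)g(t) = 3t^3 - 5t^2$. Hence,

$$\langle f, g \rangle = \int_0^1 (3t^3 - 5t^2)\, dt = \left[ \tfrac{3}{4} t^4 - \tfrac{5}{3} t^3 \right]_0^1 = \tfrac{3}{4} - \tfrac{5}{3} = -\tfrac{11}{12}$$

(b) Find $\|f\|$ and $\|g\|$.

We have $[f(t)]^2 = f(t)f(t) = 9t^2 - 30t + 25$ and $[g(t)]^2 = t^4$. Then

$$\|f\|^2 = \langle f, f \rangle = \int_0^1 (9t^2 - 30t + 25)\, dt = \left[ 3t^3 - 15t^2 + 25t \right]_0^1 = 13$$

$$\|g\|^2 = \langle g, g \rangle = \int_0^1 t^4\, dt = \left[ \tfrac{1}{5} t^5 \right]_0^1 = \tfrac{1}{5}$$

Therefore, $\|f\| = \sqrt{13}$ and $\|g\| = \sqrt{1/5} = \tfrac{1}{5}\sqrt{5}$.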
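The integrals in this example can be checked with exact rational arithmetic. The sketch below is illustrative only: it assumes polynomials are stored as coefficient lists `[c0, c1, c2, ...]` (constant term first), and the helper names `poly_mul`, `integrate`, and `inner` are invented for the purpose.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def integrate(p, lo, hi):
    """Exact integral of the polynomial p over [lo, hi]."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(c * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for k, c in enumerate(p))

def inner(p, q, lo=0, hi=1):
    """<p, q> = integral of p(t) q(t) over [lo, hi]."""
    return integrate(poly_mul(p, q), lo, hi)

f = [-5, 3]       # 3t - 5
g = [0, 0, 1]     # t^2
print(inner(f, g))   # -11/12
print(inner(f, f))   # 13
print(inner(g, g))   # 1/5
```

Using `Fraction` avoids rounding, so the printed values agree exactly with the hand computation.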


Matrix Space $M = M_{m,n}$

Let $M = M_{m,n}$, the vector space of all real $m \times n$ matrices. An inner product is defined on M by

$$\langle A, B \rangle = \mathrm{tr}(B^T A)$$

where, as usual, tr( ) is the trace, the sum of the diagonal elements. If $A = [a_{ij}]$ and $B = [b_{ij}]$, then

$$\langle A, B \rangle = \mathrm{tr}(B^T A) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ij} \quad \text{and} \quad \|A\|^2 = \langle A, A \rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2$$

That is, $\langle A, B \rangle$ is the sum of the products of the corresponding entries in A and B and, in particular, $\langle A, A \rangle$ is
the sum of the squares of the entries of A.
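A small numerical check may make the identity between the trace formula and the entrywise sum concrete. The two matrices below are arbitrary illustrative choices, not taken from the text, and the helper names are assumptions of this sketch.

```python
def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product of X (p x q) and Y (q x r)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def inner_trace(A, B):
    """<A, B> computed as tr(B^T A)."""
    return trace(matmul(transpose(B), A))

def inner_entrywise(A, B):
    """<A, B> computed as the sum of products of corresponding entries."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A = [[1, 2, 3], [4, 5, 6]]      # illustrative 2 x 3 matrices
B = [[1, 0, -1], [2, 1, 0]]
print(inner_trace(A, B), inner_entrywise(A, B))   # 11 11
print(inner_trace(A, A))                          # 91
```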


Hilbert Space 

Let V be the vector space of all infinite sequences of real numbers $(a_1, a_2, a_3, \ldots)$ satisfying

$$\sum_{i=1}^{\infty} a_i^2 = a_1^2 + a_2^2 + \cdots < \infty$$

that is, the sum converges. Addition and scalar multiplication are defined in V componentwise; that is, if

$u = (a_1, a_2, \ldots)$ and $v = (b_1, b_2, \ldots)$
then $u + v = (a_1 + b_1, a_2 + b_2, \ldots)$ and $ku = (ka_1, ka_2, \ldots)$

An inner product is defined in V by

$$\langle u, v \rangle = a_1 b_1 + a_2 b_2 + \cdots$$

The above sum converges absolutely for any pair of points in V. Hence, the inner product is well defined.
This inner product space is called $\ell_2$-space or Hilbert space.


7.4 Cauchy-Schwarz Inequality, Applications 

The following formula (proved in Problem 7.8) is called the Cauchy-Schwarz inequality or Schwarz
inequality. It is used in many branches of mathematics.

THEOREM 7.1: (Cauchy-Schwarz) For any vectors u and v in an inner product space V,

$$\langle u, v \rangle^2 \le \langle u, u \rangle \langle v, v \rangle \quad \text{or} \quad |\langle u, v \rangle| \le \|u\|\, \|v\|$$

Next we examine this inequality in specific cases.

EXAMPLE 7.4

(a) Consider any real numbers $a_1, \ldots, a_n, b_1, \ldots, b_n$. Then, by the Cauchy-Schwarz inequality,

$$(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 \le (a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2)$$

That is, $(u \cdot v)^2 \le \|u\|^2 \|v\|^2$, where $u = (a_i)$ and $v = (b_i)$.

(b) Let f and g be continuous functions on the unit interval [0, 1]. Then, by the Cauchy-Schwarz inequality,

$$\left[ \int_0^1 f(t)\, g(t)\, dt \right]^2 \le \int_0^1 f^2(t)\, dt \int_0^1 g^2(t)\, dt$$

That is, $\langle f, g \rangle^2 \le \|f\|^2 \|g\|^2$. Here V is the inner product space C[0, 1].

The next theorem (proved in Problem 7.9) gives the basic properties of a norm. The proof of the third 
property requires the Cauchy-Schwarz inequality. 

THEOREM 7.2: Let V be an inner product space. Then the norm in V satisfies the following
properties:

[N_1] $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.

[N_2] $\|kv\| = |k|\, \|v\|$.

[N_3] $\|u + v\| \le \|u\| + \|v\|$.

The property [N_3] is called the triangle inequality, because if we view $u + v$ as the side of the triangle
formed with sides u and v (as shown in Fig. 7-1), then [N_3] states that the length of one side of a triangle
cannot be greater than the sum of the lengths of the other two sides.



Figure 7-1 Triangle Inequality

Angle Between Vectors 

For any nonzero vectors u and v in an inner product space V, the angle between u and v is defined to be the
angle $\theta$ such that $0 \le \theta \le \pi$ and

$$\cos \theta = \frac{\langle u, v \rangle}{\|u\|\, \|v\|}$$

By the Cauchy-Schwarz inequality, $-1 \le \cos \theta \le 1$, and so the angle exists and is unique.
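As a rough numerical companion to this definition, the sketch below computes the angle for a pair of vectors in $\mathbf{R}^3$ under the usual dot product. The vectors and the helper names `inner`, `norm`, and `angle` are illustrative assumptions; the clamp only guards against floating-point rounding pushing the quotient slightly outside $[-1, 1]$.

```python
import math

def inner(u, v):
    """Usual dot product (assumed inner product for this sketch)."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u))

def angle(u, v):
    """Angle in [0, pi] between nonzero vectors u and v."""
    c = inner(u, v) / (norm(u) * norm(v))
    c = max(-1.0, min(1.0, c))   # Cauchy-Schwarz guarantees |c| <= 1 exactly
    return math.acos(c)

u = (2, 3, 5)
v = (1, -4, 3)
theta = angle(u, v)
print(inner(u, v) ** 2 <= inner(u, u) * inner(v, v))   # True
print(theta < math.pi / 2)                             # True: the angle is acute
```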

EXAMPLE 7.5

(a) Consider vectors $u = (2, 3, 5)$ and $v = (1, -4, 3)$ in $\mathbf{R}^3$. Then

$$\langle u, v \rangle = 2 - 12 + 15 = 5, \quad \|u\| = \sqrt{4 + 9 + 25} = \sqrt{38}, \quad \|v\| = \sqrt{1 + 16 + 9} = \sqrt{26}$$

Then the angle $\theta$ between u and v is given by

$$\cos \theta = \frac{5}{\sqrt{38}\sqrt{26}}$$

Note that $\theta$ is an acute angle, because $\cos \theta$ is positive.

(b) Let $f(t) = 3t - 5$ and $g(t) = t^2$ in the polynomial space P(t) with inner product $\langle f, g \rangle = \int_0^1 f(t)\, g(t)\, dt$. By
Example 7.3,

$$\langle f, g \rangle = -\tfrac{11}{12}, \quad \|f\| = \sqrt{13}, \quad \|g\| = \tfrac{1}{5}\sqrt{5}$$

Then the “angle” $\theta$ between f and g is given by

$$\cos \theta = \frac{-\tfrac{11}{12}}{(\sqrt{13})(\tfrac{1}{5}\sqrt{5})} = \frac{-55}{12\sqrt{13}\sqrt{5}}$$

Note that $\theta$ is an obtuse angle, because $\cos \theta$ is negative.

7.5 Orthogonality


Let V be an inner product space. The vectors $u, v \in V$ are said to be orthogonal and u is said to be
orthogonal to v if

$$\langle u, v \rangle = 0$$

The relation is clearly symmetric: if u is orthogonal to v, then $\langle v, u \rangle = 0$, and so v is orthogonal to u. We
note that $0 \in V$ is orthogonal to every $v \in V$, because

$$\langle 0, v \rangle = \langle 0v, v \rangle = 0\langle v, v \rangle = 0$$

Conversely, if u is orthogonal to every $v \in V$, then $\langle u, u \rangle = 0$ and hence $u = 0$ by [I_3]. Observe that nonzero u and v
are orthogonal if and only if $\cos \theta = 0$, where $\theta$ is the angle between u and v. Also, this is true if and only
if u and v are “perpendicular”, that is, $\theta = \pi/2$ (or $\theta = 90°$).


EXAMPLE 7.6

(a) Consider the vectors $u = (1, 1, 1)$, $v = (1, 2, -3)$, $w = (1, -4, 3)$ in $\mathbf{R}^3$. Then

$$\langle u, v \rangle = 1 + 2 - 3 = 0, \quad \langle u, w \rangle = 1 - 4 + 3 = 0, \quad \langle v, w \rangle = 1 - 8 - 9 = -16$$

Thus, u is orthogonal to v and w, but v and w are not orthogonal.

(b) Consider the functions sin t and cos t in the vector space $C[-\pi, \pi]$ of continuous functions on the closed interval
$[-\pi, \pi]$. Then

$$\langle \sin t, \cos t \rangle = \int_{-\pi}^{\pi} \sin t \cos t\, dt = \left[ \tfrac{1}{2} \sin^2 t \right]_{-\pi}^{\pi} = 0 - 0 = 0$$

Thus, sin t and cos t are orthogonal functions in the vector space $C[-\pi, \pi]$.

Remark: A vector $w = (x_1, x_2, \ldots, x_n)$ is orthogonal to $u = (a_1, a_2, \ldots, a_n)$ in $\mathbf{R}^n$ if

$$\langle u, w \rangle = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0$$

That is, w is orthogonal to u if w satisfies a homogeneous equation whose coefficients are the elements
of u.


EXAMPLE 7.7 Find a nonzero vector w that is orthogonal to $u_1 = (1, 2, 1)$ and $u_2 = (2, 5, 4)$ in $\mathbf{R}^3$.

Let $w = (x, y, z)$. Then we want $\langle u_1, w \rangle = 0$ and $\langle u_2, w \rangle = 0$. This yields the homogeneous system

$$\begin{aligned} x + 2y + z &= 0 \\ 2x + 5y + 4z &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x + 2y + z &= 0 \\ y + 2z &= 0 \end{aligned}$$

Here z is the only free variable in the echelon system. Set $z = 1$ to obtain $y = -2$ and $x = 3$. Thus, $w = (3, -2, 1)$ is a
desired nonzero vector orthogonal to $u_1$ and $u_2$.

Any multiple of w will also be orthogonal to $u_1$ and $u_2$. Normalizing w, we obtain the following unit vector
orthogonal to $u_1$ and $u_2$:

$$\hat{w} = \frac{w}{\|w\|} = \left( \frac{3}{\sqrt{14}}, -\frac{2}{\sqrt{14}}, \frac{1}{\sqrt{14}} \right)$$


Orthogonal Complements

Let S be a subset of an inner product space V. The orthogonal complement of S, denoted by $S^\perp$ (read “S
perp”) consists of those vectors in V that are orthogonal to every vector $u \in S$; that is,

$$S^\perp = \{ v \in V : \langle v, u \rangle = 0 \text{ for every } u \in S \}$$

In particular, for a given vector u in V, we have

$$u^\perp = \{ v \in V : \langle v, u \rangle = 0 \}$$

that is, $u^\perp$ consists of all vectors in V that are orthogonal to the given vector u.

We show that $S^\perp$ is a subspace of V. Clearly $0 \in S^\perp$, because 0 is orthogonal to every vector in V. Now
suppose $v, w \in S^\perp$. Then, for any scalars a and b and any vector $u \in S$, we have

$$\langle av + bw, u \rangle = a\langle v, u \rangle + b\langle w, u \rangle = a \cdot 0 + b \cdot 0 = 0$$

Thus, $av + bw \in S^\perp$, and therefore $S^\perp$ is a subspace of V.

We state this result formally.

PROPOSITION 7.3: Let S be a subset of an inner product space V. Then $S^\perp$ is a subspace of V.

Remark 1: Suppose u is a nonzero vector in $\mathbf{R}^3$. Then there is a geometrical description of $u^\perp$.
Specifically, $u^\perp$ is the plane in $\mathbf{R}^3$ through the origin O and perpendicular to the vector u. This is shown
in Fig. 7-2.


Figure 7-2


Remark 2: Let W be the solution space of an $m \times n$ homogeneous system $AX = 0$, where $A = [a_{ij}]$
and $X = [x_i]$. Recall that W may be viewed as the kernel of the linear mapping $A: \mathbf{R}^n \to \mathbf{R}^m$. Now we can
give another interpretation of W using the notion of orthogonality. Specifically, each solution vector
$w = (x_1, x_2, \ldots, x_n)$ is orthogonal to each row of A; hence, W is the orthogonal complement of the row
space of A.

EXAMPLE 7.8 Find a basis for the subspace $u^\perp$ of $\mathbf{R}^3$, where $u = (1, 3, -4)$.

Note that $u^\perp$ consists of all vectors $w = (x, y, z)$ such that $\langle u, w \rangle = 0$, or $x + 3y - 4z = 0$. The free variables
are y and z.

(1) Set $y = 1$, $z = 0$ to obtain the solution $w_1 = (-3, 1, 0)$.

(2) Set $y = 0$, $z = 1$ to obtain the solution $w_2 = (4, 0, 1)$.

The vectors $w_1$ and $w_2$ form a basis for the solution space of the equation, and hence a basis for $u^\perp$.
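The free-variable procedure used here can be sketched for a single nonzero vector u with integer entries. The function name `perp_basis` is hypothetical, and the sketch handles only the one-equation case $\langle u, w \rangle = 0$: it takes the first nonzero entry of u as the pivot and sets each remaining variable to 1 in turn.

```python
from fractions import Fraction

def inner(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def perp_basis(u):
    """Basis of the solution space of <u, w> = 0 for a nonzero integer vector u."""
    pivot = next((i for i, a in enumerate(u) if a != 0), None)
    if pivot is None:
        raise ValueError("u must be nonzero")
    basis = []
    for j in range(len(u)):
        if j == pivot:
            continue
        w = [Fraction(0)] * len(u)
        w[j] = Fraction(1)                      # free variable set to 1
        w[pivot] = -Fraction(u[j], u[pivot])    # solve for the pivot variable
        basis.append(w)
    return basis

u = [1, 3, -4]
basis = perp_basis(u)
print([[str(x) for x in w] for w in basis])
# [['-3', '1', '0'], ['4', '0', '1']]
```

Each basis vector has a 1 in its own free position and 0 in the other free positions, which is why the resulting vectors are linearly independent.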

Suppose W is a subspace of V. Then both W and $W^\perp$ are subspaces of V. The next theorem, whose proof
(Problem 7.28) requires results of later sections, is a basic result in linear algebra.

THEOREM 7.4: Let W be a subspace of V. Then V is the direct sum of W and $W^\perp$; that is,
$V = W \oplus W^\perp$.

7.6 Orthogonal Sets and Bases


Consider a set $S = \{u_1, u_2, \ldots, u_r\}$ of nonzero vectors in an inner product space V. S is called orthogonal if
each pair of vectors in S are orthogonal, and S is called orthonormal if S is orthogonal and each vector in S
has unit length. That is,

(i) Orthogonal: $\langle u_i, u_j \rangle = 0$ for $i \ne j$

(ii) Orthonormal: $\langle u_i, u_j \rangle = \begin{cases} 0 & \text{for } i \ne j \\ 1 & \text{for } i = j \end{cases}$

Normalizing an orthogonal set S refers to the process of multiplying each vector in S by the reciprocal of its
length in order to transform S into an orthonormal set of vectors.

The following theorems apply. 


THEOREM 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.

THEOREM 7.6: (Pythagoras) Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Then

$$\|u_1 + u_2 + \cdots + u_r\|^2 = \|u_1\|^2 + \|u_2\|^2 + \cdots + \|u_r\|^2$$

These theorems are proved in Problems 7.15 and 7.16, respectively. Here we prove the Pythagorean
theorem in the special and familiar case for two vectors. Specifically, suppose $\langle u, v \rangle = 0$. Then

$$\|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + 2\langle u, v \rangle + \langle v, v \rangle = \langle u, u \rangle + \langle v, v \rangle = \|u\|^2 + \|v\|^2$$

which gives our result.

EXAMPLE 7.9

(a) Let $E = \{e_1, e_2, e_3\} = \{(1,0,0), (0,1,0), (0,0,1)\}$ be the usual basis of Euclidean space $\mathbf{R}^3$. It is clear that

$$\langle e_1, e_2 \rangle = \langle e_1, e_3 \rangle = \langle e_2, e_3 \rangle = 0 \quad \text{and} \quad \langle e_1, e_1 \rangle = \langle e_2, e_2 \rangle = \langle e_3, e_3 \rangle = 1$$

Namely, E is an orthonormal basis of $\mathbf{R}^3$. More generally, the usual basis of $\mathbf{R}^n$ is orthonormal for every n.

(b) Let $V = C[-\pi, \pi]$ be the vector space of continuous functions on the interval $-\pi \le t \le \pi$ with inner product
defined by $\langle f, g \rangle = \int_{-\pi}^{\pi} f(t)\, g(t)\, dt$. Then the following is a classical example of an orthogonal set in V:

$$\{1, \cos t, \cos 2t, \cos 3t, \ldots, \sin t, \sin 2t, \sin 3t, \ldots\}$$

This orthogonal set plays a fundamental role in the theory of Fourier series.

Orthogonal Basis and Linear Combinations, Fourier Coefficients 

Let S consist of the following three vectors in $\mathbf{R}^3$:

$$u_1 = (1, 2, 1), \quad u_2 = (2, 1, -4), \quad u_3 = (3, -2, 1)$$

The reader can verify that the vectors are orthogonal; hence, they are linearly independent. Thus, S is an
orthogonal basis of $\mathbf{R}^3$.

Suppose we want to write $v = (7, 1, 9)$ as a linear combination of $u_1, u_2, u_3$. First we set v as a linear
combination of $u_1, u_2, u_3$ using unknowns $x_1, x_2, x_3$ as follows:

$$v = x_1 u_1 + x_2 u_2 + x_3 u_3 \quad \text{or} \quad (7, 1, 9) = x_1(1, 2, 1) + x_2(2, 1, -4) + x_3(3, -2, 1) \qquad (*)$$

We can proceed in two ways.


METHOD 1: Expand (*) (as in Chapter 3) to obtain the system

$$x_1 + 2x_2 + 3x_3 = 7, \quad 2x_1 + x_2 - 2x_3 = 1, \quad x_1 - 4x_2 + x_3 = 9$$

Solve the system by Gaussian elimination to obtain $x_1 = 3$, $x_2 = -1$, $x_3 = 2$. Thus,
$v = 3u_1 - u_2 + 2u_3$.

METHOD 2: (This method uses the fact that the basis vectors are orthogonal, and the arithmetic is
much simpler.) If we take the inner product of each side of (*) with respect to $u_i$, we get

$$\langle v, u_i \rangle = \langle x_1 u_1 + x_2 u_2 + x_3 u_3, u_i \rangle \quad \text{or} \quad \langle v, u_i \rangle = x_i \langle u_i, u_i \rangle \quad \text{or} \quad x_i = \frac{\langle v, u_i \rangle}{\langle u_i, u_i \rangle}$$

Here two terms drop out, because $u_1, u_2, u_3$ are orthogonal. Accordingly,

$$x_1 = \frac{\langle v, u_1 \rangle}{\langle u_1, u_1 \rangle} = \frac{7 + 2 + 9}{1 + 4 + 1} = \frac{18}{6} = 3, \qquad x_2 = \frac{\langle v, u_2 \rangle}{\langle u_2, u_2 \rangle} = \frac{14 + 1 - 36}{4 + 1 + 16} = \frac{-21}{21} = -1$$

$$x_3 = \frac{\langle v, u_3 \rangle}{\langle u_3, u_3 \rangle} = \frac{21 - 2 + 9}{9 + 4 + 1} = \frac{28}{14} = 2$$

Thus, again, we get $v = 3u_1 - u_2 + 2u_3$.
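Method 2 translates almost directly into code. The following is a minimal sketch assuming integer vectors and the usual dot product; the function name `fourier_coefficients` is an illustrative choice. The orthogonality check is included because the formula $x_i = \langle v, u_i \rangle / \langle u_i, u_i \rangle$ is only valid for an orthogonal basis.

```python
from fractions import Fraction
from itertools import combinations

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(v, basis):
    """Coordinates of v relative to an orthogonal basis of integer vectors."""
    for p, q in combinations(basis, 2):
        if inner(p, q) != 0:
            raise ValueError("basis vectors must be mutually orthogonal")
    return [Fraction(inner(v, u), inner(u, u)) for u in basis]

basis = [(1, 2, 1), (2, 1, -4), (3, -2, 1)]
v = (7, 1, 9)
coeffs = fourier_coefficients(v, basis)
print([str(c) for c in coeffs])   # ['3', '-1', '2']

# Rebuild v from the coefficients as a consistency check.
rebuilt = [sum(c * u[k] for c, u in zip(coeffs, basis)) for k in range(3)]
print(rebuilt == list(v))         # True
```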


The procedure in Method 2 is true in general. Namely, we have the following theorem (proved in
Problem 7.17).

THEOREM 7.7: Let $\{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of V. Then, for any $v \in V$,

$$v = \frac{\langle v, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1 + \frac{\langle v, u_2 \rangle}{\langle u_2, u_2 \rangle} u_2 + \cdots + \frac{\langle v, u_n \rangle}{\langle u_n, u_n \rangle} u_n$$

Remark: The scalar $k_i = \dfrac{\langle v, u_i \rangle}{\langle u_i, u_i \rangle}$ is called the Fourier coefficient of v with respect to $u_i$, because it is
analogous to a coefficient in the Fourier series of a function. This scalar also has a geometric interpretation,
which is discussed below.


Projections 


Let V be an inner product space. Suppose w is a given nonzero vector in V and suppose v is another vector.
We seek the “projection of v along w,” which, as indicated in Fig. 7-3(a), will be the multiple cw of w such
that $v' = v - cw$ is orthogonal to w. This means

$$\langle v - cw, w \rangle = 0 \quad \text{or} \quad \langle v, w \rangle - c\langle w, w \rangle = 0 \quad \text{or} \quad c = \frac{\langle v, w \rangle}{\langle w, w \rangle}$$

Figure 7-3

Accordingly, the projection of v along w is denoted and defined by

$$\mathrm{proj}(v, w) = cw = \frac{\langle v, w \rangle}{\langle w, w \rangle}\, w$$

Such a scalar c is unique, and it is called the Fourier coefficient of v with respect to w or the component of 
v along w. 
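A short sketch of the projection formula follows. The vectors are arbitrary illustrative choices (they do not come from the text), the dot product is assumed as the inner product, and `proj` is a name chosen for this sketch.

```python
from fractions import Fraction

def inner(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def proj(v, w):
    """Projection of v along a nonzero vector w: c*w with c = <v,w>/<w,w>."""
    if inner(w, w) == 0:
        raise ValueError("w must be nonzero")
    c = inner(v, w) / inner(w, w)
    return [c * a for a in w]

v = [1, 2, 3]          # illustrative vectors
w = [1, 1, 1]
p = proj(v, w)
residual = [a - b for a, b in zip(v, p)]   # this is v' = v - cw
print([str(x) for x in p])          # ['2', '2', '2']
print(inner(residual, w) == 0)      # True: v' is orthogonal to w
```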

The above notion is generalized as follows (see Problem 7.25).

THEOREM 7.8: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in V. Let v be any
vector in V. Define

$$v' = v - (c_1 w_1 + c_2 w_2 + \cdots + c_r w_r)$$

where

$$c_1 = \frac{\langle v, w_1 \rangle}{\langle w_1, w_1 \rangle}, \quad c_2 = \frac{\langle v, w_2 \rangle}{\langle w_2, w_2 \rangle}, \quad \ldots, \quad c_r = \frac{\langle v, w_r \rangle}{\langle w_r, w_r \rangle}$$

Then $v'$ is orthogonal to $w_1, w_2, \ldots, w_r$.

Note that each $c_i$ in the above theorem is the component (Fourier coefficient) of v along the given $w_i$.

Remark: The notion of the projection of a vector $v \in V$ along a subspace W of V is defined as
follows. By Theorem 7.4, $V = W \oplus W^\perp$. Hence, v may be expressed uniquely in the form

$$v = w + w', \quad \text{where } w \in W \text{ and } w' \in W^\perp$$

We define w to be the projection of v along W, and denote it by proj(v, W), as pictured in Fig. 7-3(b). In
particular, if $W = \mathrm{span}(w_1, w_2, \ldots, w_r)$, where the $w_i$ form an orthogonal set, then

$$\mathrm{proj}(v, W) = c_1 w_1 + c_2 w_2 + \cdots + c_r w_r$$

Here $c_i$ is the component of v along $w_i$, as above.


7.7 Gram-Schmidt Orthogonalization Process 


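Suppose $\{v_1, v_2, \ldots, v_n\}$ is a basis of an inner product space V. One can use this basis to construct an
orthogonal basis $\{w_1, w_2, \ldots, w_n\}$ of V as follows. Set

$$\begin{aligned}
w_1 &= v_1 \\
w_2 &= v_2 - \frac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 \\
w_3 &= v_3 - \frac{\langle v_3, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \frac{\langle v_3, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2 \\
&\ \ \vdots \\
w_n &= v_n - \frac{\langle v_n, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \frac{\langle v_n, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2 - \cdots - \frac{\langle v_n, w_{n-1} \rangle}{\langle w_{n-1}, w_{n-1} \rangle} w_{n-1}
\end{aligned}$$

In other words, for $k = 2, 3, \ldots, n$, we define

$$w_k = v_k - c_{k1} w_1 - c_{k2} w_2 - \cdots - c_{k,k-1} w_{k-1}$$

where $c_{ki} = \langle v_k, w_i \rangle / \langle w_i, w_i \rangle$ is the component of $v_k$ along $w_i$. By Theorem 7.8, each $w_k$ is orthogonal to
the preceding w's. Thus, $w_1, w_2, \ldots, w_n$ form an orthogonal basis for V as claimed. Normalizing each $w_i$
will then yield an orthonormal basis for V.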

The above construction is known as the Gram-Schmidt orthogonalization process. The following 
remarks are in order. 

Remark 1: Each vector $w_k$ is a linear combination of $v_k$ and the preceding w's. Hence, one can
easily show, by induction, that each $w_k$ is a linear combination of $v_1, v_2, \ldots, v_n$.

Remark 2: Because taking multiples of vectors does not affect orthogonality, it may be simpler in
hand calculations to clear fractions in any new $w_k$, by multiplying $w_k$ by an appropriate scalar, before
obtaining the next $w_{k+1}$.

Remark 3: Suppose $u_1, u_2, \ldots, u_r$ are linearly independent, and so they form a basis for
$U = \mathrm{span}(u_i)$. Applying the Gram-Schmidt orthogonalization process to the u's yields an orthogonal
basis for U.

The following theorems (proved in Problems 7.26 and 7.27) use the above algorithm and remarks. 


THEOREM 7.9: Let $\{v_1, v_2, \ldots, v_n\}$ be any basis of an inner product space V. Then there exists an
orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ of V such that the change-of-basis matrix from
$\{v_i\}$ to $\{u_i\}$ is triangular; that is, for $k = 1, \ldots, n$,

$$u_k = a_{k1} v_1 + a_{k2} v_2 + \cdots + a_{kk} v_k$$

THEOREM 7.10: Suppose $S = \{w_1, w_2, \ldots, w_r\}$ is an orthogonal basis for a subspace W of a vector
space V. Then one may extend S to an orthogonal basis for V; that is, one may find
vectors $w_{r+1}, \ldots, w_n$ such that $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for V.


EXAMPLE 7.10 Apply the Gram-Schmidt orthogonalization process to find an orthogonal basis and then
an orthonormal basis for the subspace U of $\mathbf{R}^4$ spanned by

$$v_1 = (1, 1, 1, 1), \quad v_2 = (1, 2, 4, 5), \quad v_3 = (1, -3, -4, -2)$$

(1) First set $w_1 = v_1 = (1, 1, 1, 1)$.

(2) Compute

$$v_2 - \frac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 = v_2 - \frac{12}{4} w_1 = (-2, -1, 1, 2)$$

Set $w_2 = (-2, -1, 1, 2)$.

(3) Compute

$$v_3 - \frac{\langle v_3, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \frac{\langle v_3, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2 = v_3 - \frac{(-8)}{4} w_1 - \frac{(-7)}{10} w_2 = \left( \frac{8}{5}, -\frac{17}{10}, -\frac{13}{10}, \frac{7}{5} \right)$$

Clear fractions to obtain $w_3 = (16, -17, -13, 14)$.

Thus, $w_1, w_2, w_3$ form an orthogonal basis for U. Normalize these vectors to obtain an orthonormal basis
$\{u_1, u_2, u_3\}$ of U. We have $\|w_1\|^2 = 4$, $\|w_2\|^2 = 10$, $\|w_3\|^2 = 910$, so

$$u_1 = \frac{1}{2}(1, 1, 1, 1), \quad u_2 = \frac{1}{\sqrt{10}}(-2, -1, 1, 2), \quad u_3 = \frac{1}{\sqrt{910}}(16, -17, -13, 14)$$
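The computation above can be reproduced with a short routine. This is a sketch under stated assumptions: the usual dot product on $\mathbf{R}^n$, exact `Fraction` arithmetic, and the function name `gram_schmidt` chosen for illustration. It uses the coefficients $c_{ki} = \langle v_k, w_i \rangle / \langle w_i, w_i \rangle$ exactly as defined in the text and does not clear fractions, so its third vector is $\tfrac{1}{10}$ of the $w_3$ chosen by hand.

```python
from fractions import Fraction

def inner(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthogonalize linearly independent vectors (no normalization)."""
    ws = []
    for v in vectors:
        w = [Fraction(a) for a in v]
        for prev in ws:
            c = inner(v, prev) / inner(prev, prev)   # component of v along prev
            w = [wi - c * pi for wi, pi in zip(w, prev)]
        if all(x == 0 for x in w):
            raise ValueError("input vectors are linearly dependent")
        ws.append(w)
    return ws

vs = [(1, 1, 1, 1), (1, 2, 4, 5), (1, -3, -4, -2)]
ws = gram_schmidt(vs)
for w in ws:
    print([str(x) for x in w])
# ['1', '1', '1', '1']
# ['-2', '-1', '1', '2']
# ['8/5', '-17/10', '-13/10', '7/5']
```

Dividing each resulting vector by its length would then give the orthonormal basis; that step is left out here so the arithmetic stays exact.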


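EXAMPLE 7.11 Let V be the vector space of polynomials f(t) with inner product $\langle f, g \rangle = \int_{-1}^{1} f(t)\, g(t)\, dt$.
Apply the Gram-Schmidt orthogonalization process to $\{1, t, t^2, t^3\}$ to find an orthogonal basis
$\{f_0, f_1, f_2, f_3\}$ with integer coefficients for $P_3(t)$.

Here we use the fact that, for $r + s = n$,

$$\langle t^r, t^s \rangle = \int_{-1}^{1} t^n\, dt = \left[ \frac{t^{n+1}}{n+1} \right]_{-1}^{1} = \begin{cases} 2/(n+1) & \text{when } n \text{ is even} \\ 0 & \text{when } n \text{ is odd} \end{cases}$$

(1) First set $f_0 = 1$.

(2) Compute $t - \dfrac{\langle t, 1 \rangle}{\langle 1, 1 \rangle}(1) = t - 0 = t$. Set $f_1 = t$.

(3) Compute

$$t^2 - \frac{\langle t^2, 1 \rangle}{\langle 1, 1 \rangle}(1) - \frac{\langle t^2, t \rangle}{\langle t, t \rangle}(t) = t^2 - \frac{2/3}{2}(1) - 0(t) = t^2 - \frac{1}{3}$$

Multiply by 3 to obtain $f_2 = 3t^2 - 1$.

(4) Compute

$$t^3 - \frac{\langle t^3, 1 \rangle}{\langle 1, 1 \rangle}(1) - \frac{\langle t^3, t \rangle}{\langle t, t \rangle}(t) - \frac{\langle t^3, 3t^2 - 1 \rangle}{\langle 3t^2 - 1, 3t^2 - 1 \rangle}(3t^2 - 1) = t^3 - 0(1) - \frac{2/5}{2/3}(t) - 0(3t^2 - 1) = t^3 - \frac{3}{5} t$$

Multiply by 5 to obtain $f_3 = 5t^3 - 3t$.

Thus, $\{1, t, 3t^2 - 1, 5t^3 - 3t\}$ is the required orthogonal basis.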

Remark: Normalizing the polynomials in Example 7.11 so that $p(1) = 1$ yields the polynomials

$$1, \quad t, \quad \tfrac{1}{2}(3t^2 - 1), \quad \tfrac{1}{2}(5t^3 - 3t)$$

These are the first four Legendre polynomials, which appear in the study of differential equations.
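The same Gram-Schmidt steps can be run on polynomials directly. The sketch below is illustrative and rests on assumptions: polynomials are coefficient lists `[c0, c1, ...]`, the inner product is the integral over $[-1, 1]$ used in Example 7.11, and each result is rescaled so that $p(1) = 1$ (which works here because the unscaled polynomials do not vanish at $t = 1$). The function names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """Integral of p(t) q(t) over [-1, 1]: t^k contributes 2/(k+1) for even k."""
    return sum(c * Fraction(2, k + 1)
               for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

def sub_scaled(p, c, q):
    """Return p - c*q, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a - c * b for a, b in zip(p, q)]

def legendre_by_gram_schmidt(max_degree):
    basis = []
    for d in range(max_degree + 1):
        v = [Fraction(0)] * d + [Fraction(1)]       # the monomial t^d
        w = v
        for prev in basis:
            w = sub_scaled(w, inner(v, prev) / inner(prev, prev), prev)
        value_at_1 = sum(w)                          # p(1) is the coefficient sum
        basis.append([c / value_at_1 for c in w])
    return basis

polys = legendre_by_gram_schmidt(3)
for p in polys:
    print([str(c) for c in p])
# ['1']
# ['0', '1']
# ['-1/2', '0', '3/2']
# ['0', '-3/2', '0', '5/2']
```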


7.8 Orthogonal and Positive Definite Matrices 

This section discusses two types of matrices that are closely related to real inner product spaces V. Here
vectors in $\mathbf{R}^n$ will be represented by column vectors. Thus, $\langle u, v \rangle = u^T v$ denotes the inner product in
Euclidean space $\mathbf{R}^n$.


Orthogonal Matrices

A real matrix P is orthogonal if P is nonsingular and $P^{-1} = P^T$, or, in other words, if $PP^T = P^T P = I$.
First we recall (Theorem 2.6) an important characterization of such matrices.

THEOREM 7.11: Let P be a real matrix. Then the following are equivalent: (a) P is orthogonal; (b) the
rows of P form an orthonormal set; (c) the columns of P form an orthonormal set.
(This theorem is true only using the usual inner product on $\mathbf{R}^n$. It is not true if $\mathbf{R}^n$ is given any other
inner product.)

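EXAMPLE 7.12

(a) Let $P = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 2/\sqrt{6} & -1/\sqrt{6} & -1/\sqrt{6} \end{bmatrix}$. The rows of P are orthogonal to each other and are unit vectors. Thus P
is an orthogonal matrix.

(b) Let P be a $2 \times 2$ orthogonal matrix. Then, for some real number $\theta$, we have

$$P = \begin{bmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{bmatrix} \quad \text{or} \quad P = \begin{bmatrix} \cos \theta & \sin \theta \\ \sin \theta & -\cos \theta \end{bmatrix}$$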

The following two theorems (proved in Problems 7.37 and 7.38) show important relationships
between orthogonal matrices and orthonormal bases of a real inner product space V.

THEOREM 7.12: Suppose $E = \{e_i\}$ and $E' = \{e_i'\}$ are orthonormal bases of V. Let P be the change-of-basis
matrix from the basis E to the basis E'. Then P is orthogonal.

THEOREM 7.13: Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space V. Let $P = [a_{ij}]$
be an orthogonal matrix. Then the following n vectors form an orthonormal basis
for V:

$$e_i' = a_{1i} e_1 + a_{2i} e_2 + \cdots + a_{ni} e_n, \quad i = 1, 2, \ldots, n$$

Positive Definite Matrices

Let A be a real symmetric matrix; that is, A 7 = A. Then A is said to be positive definite if, for every 
nonzero vector u in R", 

( u,Au) = u t Au > 0 

Algorithms to decide whether or not a matrix A is positive definite will be given in Chapter 13. However, 
for 2 x 2 matrices, we have simple criteria that we state formally in the following theorem (proved in 
Problem 7.43). 

THEOREM 7.14: A 2 x 2 real symmetric matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$ is positive definite
if and only if the diagonal entries a and d are positive and the determinant
$|A| = ad - bc = ad - b^2$ is positive.

EXAMPLE 7.13 Consider the following symmetric matrices: 



A is not positive definite, because |A| = 4 — 9 = —5 is negative. B is not positive definite, because the diagonal 
entry —3 is negative. However, C is positive definite, because the diagonal entries 1 and 5 are positive, and the 
determinant |C| = 5 — 4 = 1 is also positive. 
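
The criterion of Theorem 7.14 is easy to apply mechanically. In the sketch below, the three matrices are assumptions chosen only to be consistent with the determinants and diagonal entries quoted in Example 7.13; they are not taken from the example's display, and the function name is hypothetical.

```python
def is_positive_definite_2x2(m):
    """Apply Theorem 7.14 to a 2 x 2 real symmetric matrix [[a, b], [b, d]]."""
    (a, b), (c, d) = m
    if b != c:
        raise ValueError("matrix must be symmetric")
    return a > 0 and d > 0 and a * d - b * b > 0

# Hypothetical matrices matching the facts stated in Example 7.13:
A = [[1, 3], [3, 4]]     # determinant 4 - 9 = -5
B = [[1, -2], [-2, -3]]  # diagonal entry -3 is negative
C = [[1, -2], [-2, 5]]   # diagonal 1 and 5, determinant 5 - 4 = 1
```

With these entries, only `C` passes the test, in agreement with the conclusion of the example.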

The following theorem (proved in Problem 7.44) holds. 

THEOREM 7.15: Let A be a real positive definite matrix. Then the function $\langle u, v\rangle = u^T A v$ is an inner
product on $R^n$.

Matrix Representation of an Inner Product (Optional) 

Theorem 7.15 says that every positive definite matrix A determines an inner product on $R^n$. This
subsection may be viewed as giving the converse of this result.

Let V be a real inner product space with basis $S = \{u_1, u_2, \ldots, u_n\}$. The matrix

$A = [a_{ij}]$, where $a_{ij} = \langle u_i, u_j\rangle$

is called the matrix representation of the inner product on V relative to the basis S.

Observe that A is symmetric, because the inner product is symmetric; that is, $\langle u_i, u_j\rangle = \langle u_j, u_i\rangle$. Also, A
depends on both the inner product on V and the basis S for V. Moreover, if S is an orthogonal basis, then A
is diagonal, and if S is an orthonormal basis, then A is the identity matrix.

EXAMPLE 7.14 The vectors $u_1 = (1, 1, 0)$, $u_2 = (1, 2, 3)$, $u_3 = (1, 3, 5)$ form a basis S for Euclidean
space $R^3$. Find the matrix A that represents the inner product in $R^3$ relative to this basis S.

First compute each $\langle u_i, u_j\rangle$ to obtain

$\langle u_1, u_1\rangle = 1 + 1 + 0 = 2, \quad \langle u_1, u_2\rangle = 1 + 2 + 0 = 3, \quad \langle u_1, u_3\rangle = 1 + 3 + 0 = 4$

$\langle u_2, u_2\rangle = 1 + 4 + 9 = 14, \quad \langle u_2, u_3\rangle = 1 + 6 + 15 = 22, \quad \langle u_3, u_3\rangle = 1 + 9 + 25 = 35$

Then $A = \begin{bmatrix} 2 & 3 & 4 \\ 3 & 14 & 22 \\ 4 & 22 & 35 \end{bmatrix}$. As expected, A is symmetric.
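
The matrix of Example 7.14 can be recomputed directly. The sketch below builds it from the basis and then checks, for one pair of coordinate vectors chosen arbitrarily for illustration, that $[u]^T A [v]$ agrees with the usual dot product of the corresponding vectors. All function names are assumptions made for this sketch.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_matrix(basis):
    """Matrix A = [a_ij] with a_ij = <u_i, u_j> (usual inner product)."""
    return [[dot(u, v) for v in basis] for u in basis]

def combine(coords, basis):
    """Vector with the given coordinates relative to the basis."""
    return [sum(c * u[k] for c, u in zip(coords, basis))
            for k in range(len(basis[0]))]

def inner_via_coordinates(A, cu, cv):
    """Evaluate [u]^T A [v] for coordinate vectors cu and cv."""
    n = len(A)
    return sum(cu[i] * A[i][j] * cv[j] for i in range(n) for j in range(n))

S = [(1, 1, 0), (1, 2, 3), (1, 3, 5)]
A = gram_matrix(S)

cu, cv = (1, -2, 1), (0, 3, -1)   # arbitrary illustrative coordinates
lhs = inner_via_coordinates(A, cu, cv)
rhs = dot(combine(cu, S), combine(cv, S))
```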

The following theorems (proved in Problems 7.45 and 7.46, respectively) hold. 

THEOREM 7.16: Let A be the matrix representation of an inner product relative to basis S for V. Then,
for any vectors $u, v \in V$, we have

$\langle u, v\rangle = [u]^T A [v]$

where [u] and [v] denote the (column) coordinate vectors relative to the basis S.

THEOREM 7.17: Let A be the matrix representation of any inner product on V. Then A is a positive
definite matrix.

7.9 Complex Inner Product Spaces 


This section considers vector spaces over the complex field C. First we recall some properties of the
complex numbers (Section 1.7), especially the relations between a complex number $z = a + bi$, where
$a, b \in R$, and its complex conjugate $\bar{z} = a - bi$:

$z\bar{z} = a^2 + b^2, \quad |z| = \sqrt{a^2 + b^2}, \quad \overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2, \quad \overline{z_1 z_2} = \bar{z}_1 \bar{z}_2, \quad \bar{\bar{z}} = z$

Also, z is real if and only if $\bar{z} = z$.

The following definition applies. 

DEFINITION: Let V be a vector space over C. Suppose to each pair of vectors $u, v \in V$ there is
assigned a complex number, denoted by $\langle u, v\rangle$. This function is called a (complex) inner
product on V if it satisfies the following axioms:

$[I_1^*]$ (Linear Property) $\langle au_1 + bu_2, v\rangle = a\langle u_1, v\rangle + b\langle u_2, v\rangle$

$[I_2^*]$ (Conjugate Symmetric Property) $\langle u, v\rangle = \overline{\langle v, u\rangle}$

$[I_3^*]$ (Positive Definite Property) $\langle u, u\rangle \ge 0$; and $\langle u, u\rangle = 0$ if and only if $u = 0$.

The vector space V over C with an inner product is called a (complex) inner product space.

Observe that a complex inner product differs from the real case only in the second axiom $[I_2^*]$.
Axiom $[I_1^*]$ (Linear Property) is equivalent to the two conditions:

(a) $\langle u_1 + u_2, v\rangle = \langle u_1, v\rangle + \langle u_2, v\rangle$, (b) $\langle ku, v\rangle = k\langle u, v\rangle$

On the other hand, applying $[I_1^*]$ and $[I_2^*]$, we obtain

$\langle u, kv\rangle = \overline{\langle kv, u\rangle} = \overline{k\langle v, u\rangle} = \bar{k}\,\overline{\langle v, u\rangle} = \bar{k}\langle u, v\rangle$

That is, we must take the conjugate of a complex number when it is taken out of the second position of a
complex inner product. In fact (Problem 7.47), the inner product is conjugate linear in the second position;
that is,

$\langle u, av_1 + bv_2\rangle = \bar{a}\langle u, v_1\rangle + \bar{b}\langle u, v_2\rangle$

Combining linear in the first position and conjugate linear in the second position, we obtain, by induction,

$\left\langle \sum_i a_i u_i, \sum_j b_j v_j \right\rangle = \sum_{i,j} a_i \bar{b}_j \langle u_i, v_j\rangle$

The following remarks are in order.

Remark 1: Axiom $[I_1^*]$ by itself implies that $\langle 0, 0\rangle = \langle 0v, 0\rangle = 0\langle v, 0\rangle = 0$. Accordingly, $[I_1^*]$, $[I_2^*]$,
and $[I_3^*]$ are equivalent to $[I_1^*]$, $[I_2^*]$, and the following axiom:

$[I_3^{*\prime}]$ If $u \ne 0$, then $\langle u, u\rangle > 0$.

That is, a function satisfying $[I_1^*]$, $[I_2^*]$, and $[I_3^{*\prime}]$ is a (complex) inner product on V.

Remark 2: By $[I_2^*]$, $\langle u, u\rangle = \overline{\langle u, u\rangle}$. Thus, $\langle u, u\rangle$ must be real. By $[I_3^*]$, $\langle u, u\rangle$ must be nonnegative,
and hence, its nonnegative real square root exists. As with real inner product spaces, we define $\|u\| = \sqrt{\langle u, u\rangle}$
to be the norm or length of u.

Remark 3: In addition to the norm, we define the notions of orthogonality, orthogonal complement,
and orthogonal and orthonormal sets as before. In fact, the definitions of distance and Fourier coefficient
and projections are the same as in the real case.


EXAMPLE 7.15 (Complex Euclidean Space $C^n$). Let $V = C^n$, and let $u = (z_i)$ and $v = (w_i)$ be vectors in
$C^n$. Then

$\langle u, v\rangle = \sum_k z_k \bar{w}_k = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n$

is an inner product on V, called the usual or standard inner product on $C^n$. V with this inner product is called
Complex Euclidean Space. We assume this inner product on $C^n$ unless otherwise stated or implied. Assuming u and v
are column vectors, the above inner product may be defined by

$\langle u, v\rangle = u^T \bar{v}$

where, as with matrices, $\bar{v}$ means the conjugate of each element of v. If u and v are real, we have $\bar{w}_i = w_i$. In this case,
the inner product reduces to the analogous one on $R^n$.
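
A small numerical illustration of the standard inner product on $C^n$ may help fix the role of the conjugate. The sketch below uses Python's built-in complex numbers; the vectors and the scalar `k` are arbitrary choices for illustration, and the function name `inner` is an assumption.

```python
def inner(u, v):
    """Standard inner product on C^n: sum of z_k times conjugate(w_k)."""
    return sum(z * w.conjugate() for z, w in zip(u, v))

u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]
k = 2 - 3j

uv = inner(u, v)                      # <u, v>
vu = inner(v, u)                      # <v, u>
uu = inner(u, u)                      # <u, u>, real and positive
u_kv = inner(u, [k * w for w in v])   # <u, kv>
```

The values illustrate conjugate symmetry, $\langle u, v\rangle = \overline{\langle v, u\rangle}$, and the rule $\langle u, kv\rangle = \bar{k}\langle u, v\rangle$ for scalars in the second position.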

EXAMPLE 7.16

(a) Let V be the vector space of complex continuous functions on the (real) interval $a \le t \le b$. Then the following is
the usual inner product on V:

$\langle f, g\rangle = \int_a^b f(t)\overline{g(t)}\,dt$

(b) Let U be the vector space of m x n matrices over C. Suppose $A = (z_{ij})$ and $B = (w_{ij})$ are elements of U. Then the
following is the usual inner product on U:

$\langle A, B\rangle = \mathrm{tr}(B^H A) = \sum_{i=1}^{m}\sum_{j=1}^{n} \bar{w}_{ij} z_{ij}$

As usual, $B^H = \bar{B}^T$; that is, $B^H$ is the conjugate transpose of B.

The following is a list of theorems for complex inner product spaces that are analogous to those for the
real case. Here a Hermitian matrix A (i.e., one where $A^H = \bar{A}^T = A$) plays the same role that a symmetric
matrix A (i.e., one where $A^T = A$) plays in the real case. (Theorem 7.18 is proved in Problem 7.50.)

THEOREM 7.18: (Cauchy-Schwarz) Let V be a complex inner product space. Then

$|\langle u, v\rangle| \le \|u\|\,\|v\|$

THEOREM 7.19: Let W be a subspace of a complex inner product space V. Then $V = W \oplus W^\perp$.

THEOREM 7.20: Suppose $\{u_1, u_2, \ldots, u_n\}$ is an orthogonal basis for a complex inner product space V. Then, for
any $v \in V$,

$v = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}u_1 + \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle}u_2 + \cdots + \frac{\langle v, u_n\rangle}{\langle u_n, u_n\rangle}u_n$

THEOREM 7.21: Suppose $\{u_1, u_2, \ldots, u_n\}$ is a basis for a complex inner product space V. Let $A = [a_{ij}]$
be the complex matrix defined by $a_{ij} = \langle u_i, u_j\rangle$. Then, for any $u, v \in V$,

$\langle u, v\rangle = [u]^T A \overline{[v]}$

where [u] and [v] are the coordinate column vectors in the given basis $\{u_i\}$. (Remark:
This matrix A is said to represent the inner product on V.)

THEOREM 7.22: Let A be a Hermitian matrix (i.e., $A^H = \bar{A}^T = A$) such that $X^T A \bar{X}$ is real and positive
for every nonzero vector $X \in C^n$. Then $\langle u, v\rangle = u^T A \bar{v}$ is an inner product on $C^n$.

THEOREM 7.23: Let A be the matrix that represents an inner product on V. Then A is Hermitian, and
$X^T A \bar{X}$ is real and positive for any nonzero vector X in $C^n$.

7.10 Normed Vector Spaces (Optional)

We begin with a definition. 

DEFINITION: Let V be a real or complex vector space. Suppose to each $v \in V$ there is assigned a real
number, denoted by $\|v\|$. This function $\|\cdot\|$ is called a norm on V if it satisfies the
following axioms:

$[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.

$[N_2]$ $\|kv\| = |k|\,\|v\|$.

$[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.

A vector space V with a norm is called a normed vector space. 

Suppose V is a normed vector space. The distance between two vectors u and v in V is denoted and
defined by

$d(u, v) = \|u - v\|$

The following theorem (proved in Problem 7.56) is the main reason why d(u, v) is called the distance
between u and v.

THEOREM 7.24: Let V be a normed vector space. Then the function $d(u, v) = \|u - v\|$ satisfies the
following three axioms of a metric space:

$[M_1]$ $d(u, v) \ge 0$; and $d(u, v) = 0$ if and only if $u = v$.

$[M_2]$ $d(u, v) = d(v, u)$.

$[M_3]$ $d(u, v) \le d(u, w) + d(w, v)$.


Normed Vector Spaces and Inner Product Spaces 

Suppose V is an inner product space. Recall that the norm of a vector v in V is defined by

$\|v\| = \sqrt{\langle v, v\rangle}$

One can prove (Theorem 7.2) that this norm satisfies $[N_1]$, $[N_2]$, and $[N_3]$. Thus, every inner product space
V is a normed vector space. On the other hand, there may be norms on a vector space V that do not come
from an inner product on V, as shown below.


Norms on $R^n$ and $C^n$

The following define three important norms on $R^n$ and $C^n$:

$\|(a_1, \ldots, a_n)\|_\infty = \max(|a_i|)$

$\|(a_1, \ldots, a_n)\|_1 = |a_1| + |a_2| + \cdots + |a_n|$

$\|(a_1, \ldots, a_n)\|_2 = \sqrt{|a_1|^2 + |a_2|^2 + \cdots + |a_n|^2}$

(Note that subscripts are used to distinguish between the three norms.) The norms $\|\cdot\|_\infty$, $\|\cdot\|_1$, and $\|\cdot\|_2$
are called the infinity-norm, one-norm, and two-norm, respectively. Observe that $\|\cdot\|_2$ is the norm on $R^n$
(respectively, $C^n$) induced by the usual inner product on $R^n$ (respectively, $C^n$). We will let $d_\infty$, $d_1$, $d_2$
denote the corresponding distance functions.

EXAMPLE 7.17 Consider vectors $u = (1, -5, 3)$ and $v = (4, 2, -3)$ in $R^3$.

(a) The infinity norm chooses the maximum of the absolute values of the components. Hence,

$\|u\|_\infty = 5$ and $\|v\|_\infty = 4$

(b) The one-norm adds the absolute values of the components. Thus,

$\|u\|_1 = 1 + 5 + 3 = 9$ and $\|v\|_1 = 4 + 2 + 3 = 9$

(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by
the usual inner product on $R^3$). Thus,

$\|u\|_2 = \sqrt{1 + 25 + 9} = \sqrt{35}$ and $\|v\|_2 = \sqrt{16 + 4 + 9} = \sqrt{29}$

(d) Because $u - v = (1 - 4, -5 - 2, 3 + 3) = (-3, -7, 6)$, we have

$d_\infty(u, v) = 7, \quad d_1(u, v) = 3 + 7 + 6 = 16, \quad d_2(u, v) = \sqrt{9 + 49 + 36} = \sqrt{94}$
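
The three norms and their distance functions are simple to compute. The following sketch reproduces the numbers of Example 7.17; the function names are illustrative assumptions.

```python
import math

def norm_inf(v):
    """Infinity-norm: largest absolute value of a component."""
    return max(abs(a) for a in v)

def norm_1(v):
    """One-norm: sum of the absolute values of the components."""
    return sum(abs(a) for a in v)

def norm_2(v):
    """Two-norm: square root of the sum of squared absolute values."""
    return math.sqrt(sum(abs(a) ** 2 for a in v))

def distance(norm, u, v):
    """Distance d(u, v) = ||u - v|| for the given norm function."""
    return norm([a - b for a, b in zip(u, v)])

u = (1, -5, 3)
v = (4, 2, -3)
```

Because `abs` also works on Python complex numbers, the same functions apply to vectors in $C^n$.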

EXAMPLE 7.18 Consider the Cartesian plane $R^2$ shown in Fig. 7-4.

(a) Let $D_1$ be the set of points $u = (x, y)$ in $R^2$ such that $\|u\|_2 = 1$. Then $D_1$ consists of the points (x, y) such that
$\|u\|_2^2 = x^2 + y^2 = 1$. Thus, $D_1$ is the unit circle, as shown in Fig. 7-4.

Figure 7-4

(b) Let $D_2$ be the set of points $u = (x, y)$ in $R^2$ such that $\|u\|_1 = 1$. Then $D_2$ consists of the points (x, y) such that
$\|u\|_1 = |x| + |y| = 1$. Thus, $D_2$ is the diamond inside the unit circle, as shown in Fig. 7-4.

(c) Let $D_3$ be the set of points $u = (x, y)$ in $R^2$ such that $\|u\|_\infty = 1$. Then $D_3$ consists of the points (x, y) such that
$\|u\|_\infty = \max(|x|, |y|) = 1$. Thus, $D_3$ is the square circumscribing the unit circle, as shown in Fig. 7-4.


Norms on C[a,b] 

Consider the vector space $V = C[a, b]$ of real continuous functions on the interval $a \le t \le b$. Recall that
the following defines an inner product on V:

$\langle f, g\rangle = \int_a^b f(t)g(t)\,dt$

Accordingly, the above inner product defines the following norm on $V = C[a, b]$ (which is analogous to
the $\|\cdot\|_2$ norm on $R^n$):

$\|f\|_2 = \sqrt{\int_a^b [f(t)]^2\,dt}$

The following define the other norms on $V = C[a, b]$:

$\|f\|_1 = \int_a^b |f(t)|\,dt$ and $\|f\|_\infty = \max(|f(t)|)$


There are geometrical descriptions of these two norms and their corresponding distance functions, which
are described below.

The first norm is pictured in Fig. 7-5. Here

$\|f\|_1$ = area between the function |f| and the t-axis
$d_1(f, g)$ = area between the functions f and g

Figure 7-5

This norm is analogous to the norm $\|\cdot\|_1$ on $R^n$.

The second norm is pictured in Fig. 7-6. Here

$\|f\|_\infty$ = maximum distance between f and the t-axis
$d_\infty(f, g)$ = maximum distance between f and g

This norm is analogous to the norm $\|\cdot\|_\infty$ on $R^n$.




Figure 7-6 


SOLVED PROBLEMS 


Inner Products 

7.1. Expand:

(a) $\langle 5u_1 + 8u_2,\ 6v_1 - 7v_2\rangle$,

(b) $\langle 3u + 5v,\ 4u - 6v\rangle$,

(c) $\|2u - 3v\|^2$

Use linearity in both positions and, when possible, symmetry, $\langle u, v\rangle = \langle v, u\rangle$.

(a) Take the inner product of each term on the left with each term on the right:

$\langle 5u_1 + 8u_2,\ 6v_1 - 7v_2\rangle = \langle 5u_1, 6v_1\rangle + \langle 5u_1, -7v_2\rangle + \langle 8u_2, 6v_1\rangle + \langle 8u_2, -7v_2\rangle$

$= 30\langle u_1, v_1\rangle - 35\langle u_1, v_2\rangle + 48\langle u_2, v_1\rangle - 56\langle u_2, v_2\rangle$

[Remark: Observe the similarity between the above expansion and the expansion of $(5a + 8b)(6c - 7d)$ in
ordinary algebra.]

(b) $\langle 3u + 5v,\ 4u - 6v\rangle = 12\langle u, u\rangle - 18\langle u, v\rangle + 20\langle v, u\rangle - 30\langle v, v\rangle$

$= 12\langle u, u\rangle + 2\langle u, v\rangle - 30\langle v, v\rangle$

(c) $\|2u - 3v\|^2 = \langle 2u - 3v,\ 2u - 3v\rangle = 4\langle u, u\rangle - 6\langle u, v\rangle - 6\langle v, u\rangle + 9\langle v, v\rangle$

$= 4\|u\|^2 - 12\langle u, v\rangle + 9\|v\|^2$

7.2. Consider vectors $u = (1, 2, 4)$, $v = (2, -3, 5)$, $w = (4, 2, -3)$ in $R^3$. Find

(a) $u \cdot v$, (b) $u \cdot w$, (c) $v \cdot w$, (d) $(u + v) \cdot w$, (e) $\|u\|$, (f) $\|v\|$.

(a) Multiply corresponding components and add to get $u \cdot v = 2 - 6 + 20 = 16$.

(b) $u \cdot w = 4 + 4 - 12 = -4$.

(c) $v \cdot w = 8 - 6 - 15 = -13$.

(d) First find $u + v = (3, -1, 9)$. Then $(u + v) \cdot w = 12 - 2 - 27 = -17$. Alternatively, using $[I_1]$,

$(u + v) \cdot w = u \cdot w + v \cdot w = -4 - 13 = -17$.

(e) First find $\|u\|^2$ by squaring the components of u and adding:

$\|u\|^2 = 1^2 + 2^2 + 4^2 = 1 + 4 + 16 = 21$, and so $\|u\| = \sqrt{21}$

(f) $\|v\|^2 = 4 + 9 + 25 = 38$, and so $\|v\| = \sqrt{38}$.

7.3. Verify that the following defines an inner product in $R^2$:

$\langle u, v\rangle = x_1y_1 - x_1y_2 - x_2y_1 + 3x_2y_2$, where $u = (x_1, x_2)$, $v = (y_1, y_2)$

We argue via matrices. We can write $\langle u, v\rangle$ in matrix notation as follows:

$\langle u, v\rangle = u^T A v = [x_1, x_2]\begin{bmatrix} 1 & -1 \\ -1 & 3 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$

Because A is real and symmetric, we need only show that A is positive definite. The diagonal elements 1 and 3
are positive, and the determinant $|A| = 3 - 1 = 2$ is positive. Thus, by Theorem 7.14, A is positive definite.
Accordingly, by Theorem 7.15, $\langle u, v\rangle$ is an inner product.

7.4. Consider the vectors $u = (1, 5)$ and $v = (3, 4)$ in $R^2$. Find

(a) $\langle u, v\rangle$ with respect to the usual inner product in $R^2$.

(b) $\langle u, v\rangle$ with respect to the inner product in $R^2$ in Problem 7.3.

(c) $\|v\|$ using the usual inner product in $R^2$.

(d) $\|v\|$ using the inner product in $R^2$ in Problem 7.3.

(a) $\langle u, v\rangle = 3 + 20 = 23$.

(b) $\langle u, v\rangle = 1 \cdot 3 - 1 \cdot 4 - 5 \cdot 3 + 3 \cdot 5 \cdot 4 = 3 - 4 - 15 + 60 = 44$.

(c) $\|v\|^2 = \langle v, v\rangle = \langle (3, 4), (3, 4)\rangle = 9 + 16 = 25$; hence, $\|v\| = 5$.

(d) $\|v\|^2 = \langle v, v\rangle = \langle (3, 4), (3, 4)\rangle = 9 - 12 - 12 + 48 = 33$; hence, $\|v\| = \sqrt{33}$.
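
The four values in Problem 7.4 can be checked by evaluating $u^T A v$, taking A to be the identity for the usual inner product and the matrix with rows (1, -1) and (-1, 3) for the inner product of Problem 7.3. This is a sketch only; the function name `inner` and its optional argument are assumptions made for illustration.

```python
def inner(u, v, A=None):
    """Evaluate u^T A v; A defaults to the identity (usual inner product)."""
    n = len(u)
    if A is None:
        A = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return sum(u[i] * A[i][j] * v[j] for i in range(n) for j in range(n))

A = [[1, -1], [-1, 3]]   # matrix of the inner product in Problem 7.3
u = (1, 5)
v = (3, 4)

usual_uv = inner(u, v)          # part (a)
weighted_uv = inner(u, v, A)    # part (b)
usual_vv = inner(v, v)          # part (c): ||v||^2
weighted_vv = inner(v, v, A)    # part (d): ||v||^2
```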

7.5. Consider the following polynomials in P(t) with the inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$:

$f(t) = t + 2, \quad g(t) = 3t - 2, \quad h(t) = t^2 - 2t - 3$

(a) Find $\langle f, g\rangle$ and $\langle f, h\rangle$.

(b) Find $\|f\|$ and $\|g\|$.

(c) Normalize f and g.

(a) Integrate as follows:

$\langle f, g\rangle = \int_0^1 (t + 2)(3t - 2)\,dt = \int_0^1 (3t^2 + 4t - 4)\,dt = \left[t^3 + 2t^2 - 4t\right]_0^1 = -1$

$\langle f, h\rangle = \int_0^1 (t + 2)(t^2 - 2t - 3)\,dt = \left[\frac{t^4}{4} - \frac{7t^2}{2} - 6t\right]_0^1 = -\frac{37}{4}$

(b) $\langle f, f\rangle = \int_0^1 (t + 2)(t + 2)\,dt = \frac{19}{3}$; hence, $\|f\| = \sqrt{\frac{19}{3}} = \frac{1}{3}\sqrt{57}$

$\langle g, g\rangle = \int_0^1 (3t - 2)(3t - 2)\,dt = 1$; hence, $\|g\| = \sqrt{1} = 1$

(c) Because $\|f\| = \frac{1}{3}\sqrt{57}$ and g is already a unit vector, we have

$\hat{f} = \frac{1}{\|f\|}f = \frac{3}{\sqrt{57}}(t + 2)$ and $\hat{g} = g = 3t - 2$
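
The integrals in Problem 7.5 can be verified with exact rational arithmetic. In the sketch below, a polynomial is stored as a list of integer coefficients with the lowest degree first; this representation and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate_0_1(p):
    """Integral of p over [0, 1]: sum of c_k / (k + 1)."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(p))

def inner(p, q):
    """<p, q> = integral over [0, 1] of p(t) q(t) dt."""
    return integrate_0_1(multiply(p, q))

f = [2, 1]        # t + 2
g = [-2, 3]       # 3t - 2
h = [-3, -2, 1]   # t^2 - 2t - 3
```

The squared norms are `inner(f, f)` and `inner(g, g)`; taking square roots gives $\|f\| = \frac{1}{3}\sqrt{57}$ and $\|g\| = 1$ as in part (b).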


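7.6. Find $\cos\theta$ where $\theta$ is the angle between:

(a) $u = (1, 3, -5, 4)$ and $v = (2, -3, 4, 1)$ in $R^4$,

(b) $A = \begin{bmatrix} 9 & 8 & 7 \\ 6 & 5 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$, where $\langle A, B\rangle = \mathrm{tr}(B^T A)$.

Use $\cos\theta = \dfrac{\langle u, v\rangle}{\|u\|\,\|v\|}$.

(a) Compute:

$\langle u, v\rangle = 2 - 9 - 20 + 4 = -23, \quad \|u\|^2 = 1 + 9 + 25 + 16 = 51, \quad \|v\|^2 = 4 + 9 + 16 + 1 = 30$

Thus,

$\cos\theta = \dfrac{-23}{\sqrt{51}\sqrt{30}} = \dfrac{-23}{3\sqrt{170}}$

(b) Use $\langle A, B\rangle = \mathrm{tr}(B^T A) = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}b_{ij}$, the sum of the products of corresponding entries.

$\langle A, B\rangle = 9 + 16 + 21 + 24 + 25 + 24 = 119$

Use $\|A\|^2 = \langle A, A\rangle = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2$, the sum of the squares of all the elements of A.

$\|A\|^2 = \langle A, A\rangle = 9^2 + 8^2 + 7^2 + 6^2 + 5^2 + 4^2 = 271$, and so $\|A\| = \sqrt{271}$

$\|B\|^2 = \langle B, B\rangle = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 = 91$, and so $\|B\| = \sqrt{91}$

Thus, $\cos\theta = \dfrac{119}{\sqrt{271}\sqrt{91}}$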
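
Both parts of Problem 7.6 reduce to the same computation once a matrix is flattened into a list of its entries, because $\mathrm{tr}(B^T A)$ is the sum of the products of corresponding entries. The sketch below uses that observation; the function names are illustrative assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cos_angle(u, v):
    """cos(theta) = <u, v> / (||u|| ||v||) for nonzero u and v."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def flatten(m):
    """List the entries of a matrix row by row."""
    return [x for row in m for x in row]

u = (1, 3, -5, 4)
v = (2, -3, 4, 1)
A = [[9, 8, 7], [6, 5, 4]]
B = [[1, 2, 3], [4, 5, 6]]

cos_a = cos_angle(u, v)
cos_b = cos_angle(flatten(A), flatten(B))
```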


7.7. Verify each of the following:

(a) Parallelogram Law (Fig. 7-7): $\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$.

(b) Polar form for $\langle u, v\rangle$ (which shows the inner product can be obtained from the norm function):

$\langle u, v\rangle = \frac{1}{4}\left(\|u + v\|^2 - \|u - v\|^2\right)$.

Expand as follows to obtain

$\|u + v\|^2 = \langle u + v, u + v\rangle = \|u\|^2 + 2\langle u, v\rangle + \|v\|^2$   (1)

$\|u - v\|^2 = \langle u - v, u - v\rangle = \|u\|^2 - 2\langle u, v\rangle + \|v\|^2$   (2)

Add (1) and (2) to get the Parallelogram Law (a). Subtract (2) from (1) to obtain

$\|u + v\|^2 - \|u - v\|^2 = 4\langle u, v\rangle$

Divide by 4 to obtain the (real) polar form (b).

Figure 7-7


7.8. Prove Theorem 7.1 (Cauchy-Schwarz): For u and v in a real inner product space V,

$\langle u, v\rangle^2 \le \langle u, u\rangle\langle v, v\rangle$ or $|\langle u, v\rangle| \le \|u\|\,\|v\|$.

For any real number t,

$\langle tu + v, tu + v\rangle = t^2\langle u, u\rangle + 2t\langle u, v\rangle + \langle v, v\rangle = t^2\|u\|^2 + 2t\langle u, v\rangle + \|v\|^2$

Let $a = \|u\|^2$, $b = 2\langle u, v\rangle$, $c = \|v\|^2$. Because $\|tu + v\|^2 \ge 0$, we have

$at^2 + bt + c \ge 0$

for every value of t. This means that the quadratic polynomial cannot have two distinct real roots, which implies that
$b^2 - 4ac \le 0$ or $b^2 \le 4ac$. Thus,

$4\langle u, v\rangle^2 \le 4\|u\|^2\|v\|^2$

Dividing by 4 gives our result.

7.9. Prove Theorem 7.2: The norm in an inner product space V satisfies

(a) $[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.

(b) $[N_2]$ $\|kv\| = |k|\,\|v\|$.

(c) $[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.

(a) If $v \ne 0$, then $\langle v, v\rangle > 0$, and hence, $\|v\| = \sqrt{\langle v, v\rangle} > 0$. If $v = 0$, then $\langle 0, 0\rangle = 0$. Consequently,
$\|0\| = \sqrt{0} = 0$. Thus, $[N_1]$ is true.

(b) We have $\|kv\|^2 = \langle kv, kv\rangle = k^2\langle v, v\rangle = k^2\|v\|^2$. Taking the square root of both sides gives $[N_2]$.

(c) Using the Cauchy-Schwarz inequality, we obtain

$\|u + v\|^2 = \langle u + v, u + v\rangle = \langle u, u\rangle + \langle u, v\rangle + \langle u, v\rangle + \langle v, v\rangle$

$\le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2$

Taking the square root of both sides yields $[N_3]$.

Orthogonality, Orthogonal Complements, Orthogonal Sets

7.10. Find k so that $u = (1, 2, k, 3)$ and $v = (3, k, 7, -5)$ in $R^4$ are orthogonal.

First find

$\langle u, v\rangle = (1, 2, k, 3) \cdot (3, k, 7, -5) = 3 + 2k + 7k - 15 = 9k - 12$

Then set $\langle u, v\rangle = 9k - 12 = 0$ to obtain $k = \frac{4}{3}$.

7.11. Let W be the subspace of $R^5$ spanned by $u = (1, 2, 3, -1, 2)$ and $v = (2, 4, 7, 2, -1)$. Find a
basis of the orthogonal complement $W^\perp$ of W.

We seek all vectors $w = (x, y, z, s, t)$ such that

$\langle w, u\rangle = x + 2y + 3z - s + 2t = 0$
$\langle w, v\rangle = 2x + 4y + 7z + 2s - t = 0$

Eliminating x from the second equation, we find the equivalent system

$x + 2y + 3z - s + 2t = 0$
$z + 4s - 5t = 0$

The free variables are y, s, and t. Therefore,

(1) Set y = -1, s = 0, t = 0 to obtain the solution $w_1 = (2, -1, 0, 0, 0)$.

(2) Set y = 0, s = 1, t = 0 to find the solution $w_2 = (13, 0, -4, 1, 0)$.

(3) Set y = 0, s = 0, t = 1 to obtain the solution $w_3 = (-17, 0, 5, 0, 1)$.

The set $\{w_1, w_2, w_3\}$ is a basis of $W^\perp$.
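
A quick check of Problem 7.11: each of the three solutions should be orthogonal to both spanning vectors of W. The sketch below verifies this with plain integer arithmetic; the variable names are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u = (1, 2, 3, -1, 2)
v = (2, 4, 7, 2, -1)

complement_basis = [
    (2, -1, 0, 0, 0),    # y = -1, s = 0, t = 0
    (13, 0, -4, 1, 0),   # y = 0,  s = 1, t = 0
    (-17, 0, 5, 0, 1),   # y = 0,  s = 0, t = 1
]

# Inner products of every basis vector with u and with v.
products = [(dot(w, u), dot(w, v)) for w in complement_basis]
```

Linear independence of the three vectors can be read off from the free-variable positions (the y, s, and t components), and the count agrees with dim $W^\perp$ = 5 - 2 = 3.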

7.12. Let $w = (1, 2, 3, 1)$ be a vector in $R^4$. Find an orthogonal basis for $w^\perp$.

Find a nonzero solution of $x + 2y + 3z + t = 0$, say $v_1 = (0, 0, 1, -3)$. Now find a nonzero solution of
the system

$x + 2y + 3z + t = 0, \quad z - 3t = 0$

say $v_2 = (0, -5, 3, 1)$. Last, find a nonzero solution of the system

$x + 2y + 3z + t = 0, \quad -5y + 3z + t = 0, \quad z - 3t = 0$

say $v_3 = (-14, 2, 3, 1)$. Thus, $v_1, v_2, v_3$ form an orthogonal basis for $w^\perp$.

7.13. Let S consist of the following vectors in $R^4$:

$u_1 = (1, 1, 0, -1), \quad u_2 = (1, 2, 1, 3), \quad u_3 = (1, 1, -9, 2), \quad u_4 = (16, -13, 1, 3)$

(a) Show that S is orthogonal and a basis of $R^4$.

(b) Find the coordinates of an arbitrary vector $v = (a, b, c, d)$ in $R^4$ relative to the basis S.

(a) Compute

$u_1 \cdot u_2 = 1 + 2 + 0 - 3 = 0, \quad u_1 \cdot u_3 = 1 + 1 + 0 - 2 = 0, \quad u_1 \cdot u_4 = 16 - 13 + 0 - 3 = 0$

$u_2 \cdot u_3 = 1 + 2 - 9 + 6 = 0, \quad u_2 \cdot u_4 = 16 - 26 + 1 + 9 = 0, \quad u_3 \cdot u_4 = 16 - 13 - 9 + 6 = 0$

Thus, S is orthogonal, and S is linearly independent. Accordingly, S is a basis for $R^4$ because any four
linearly independent vectors form a basis of $R^4$.

(b) Because S is orthogonal, we need only find the Fourier coefficients of v with respect to the basis vectors,
as in Theorem 7.7. Thus,

$k_1 = \dfrac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle} = \dfrac{a + b - d}{3}, \qquad k_3 = \dfrac{\langle v, u_3\rangle}{\langle u_3, u_3\rangle} = \dfrac{a + b - 9c + 2d}{87}$

$k_2 = \dfrac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle} = \dfrac{a + 2b + c + 3d}{15}, \qquad k_4 = \dfrac{\langle v, u_4\rangle}{\langle u_4, u_4\rangle} = \dfrac{16a - 13b + c + 3d}{435}$

are the coordinates of v with respect to the basis S.
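
The coordinate formulas of Problem 7.13 can be exercised on a concrete vector. The sketch below computes the Fourier coefficients with exact fractions and then rebuilds the vector from them; the test vector (3, -1, 4, 2) and the function names are arbitrary assumptions made for illustration.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def coordinates(v, basis):
    """Fourier coefficients <v, u_i>/<u_i, u_i> for an orthogonal basis."""
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

def combine(coeffs, basis):
    """Linear combination k_1 u_1 + ... + k_n u_n."""
    return [sum(k * u[i] for k, u in zip(coeffs, basis))
            for i in range(len(basis[0]))]

S = [(1, 1, 0, -1), (1, 2, 1, 3), (1, 1, -9, 2), (16, -13, 1, 3)]
v = (3, -1, 4, 2)

k = coordinates(v, S)
rebuilt = combine(k, S)
pairwise = [dot(S[i], S[j]) for i in range(4) for j in range(i + 1, 4)]
```

This shortcut relies on orthogonality; for a basis that is not orthogonal, the coordinates would have to be found by solving a linear system instead.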

7.14. Suppose $S$, $S_1$, $S_2$ are subsets of V. Prove the following (where $S^{\perp\perp}$ means $(S^\perp)^\perp$):

(a) $S \subseteq S^{\perp\perp}$.

(b) If $S_1 \subseteq S_2$, then $S_2^\perp \subseteq S_1^\perp$.

(c) $S^\perp = \mathrm{span}(S)^\perp$.

(a) Let $w \in S$. Then $\langle w, v\rangle = 0$ for every $v \in S^\perp$; hence, $w \in S^{\perp\perp}$. Accordingly, $S \subseteq S^{\perp\perp}$.

(b) Let $w \in S_2^\perp$. Then $\langle w, v\rangle = 0$ for every $v \in S_2$. Because $S_1 \subseteq S_2$, $\langle w, v\rangle = 0$ for every $v \in S_1$. Thus,
$w \in S_1^\perp$, and hence, $S_2^\perp \subseteq S_1^\perp$.

(c) Because $S \subseteq \mathrm{span}(S)$, part (b) gives us $\mathrm{span}(S)^\perp \subseteq S^\perp$. Suppose $u \in S^\perp$ and $v \in \mathrm{span}(S)$. Then there
exist $w_1, w_2, \ldots, w_k$ in S such that $v = a_1w_1 + a_2w_2 + \cdots + a_kw_k$. Then, using $u \in S^\perp$, we have

$\langle u, v\rangle = \langle u, a_1w_1 + a_2w_2 + \cdots + a_kw_k\rangle = a_1\langle u, w_1\rangle + a_2\langle u, w_2\rangle + \cdots + a_k\langle u, w_k\rangle$

$= a_1(0) + a_2(0) + \cdots + a_k(0) = 0$

Thus, $u \in \mathrm{span}(S)^\perp$. Accordingly, $S^\perp \subseteq \mathrm{span}(S)^\perp$. Both inclusions give $S^\perp = \mathrm{span}(S)^\perp$.

7.15. Prove Theorem 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly
independent.

Suppose $S = \{u_1, u_2, \ldots, u_r\}$ and suppose

$a_1u_1 + a_2u_2 + \cdots + a_ru_r = 0$   (1)

Taking the inner product of (1) with $u_1$, we get

$0 = \langle 0, u_1\rangle = \langle a_1u_1 + a_2u_2 + \cdots + a_ru_r,\ u_1\rangle$

$= a_1\langle u_1, u_1\rangle + a_2\langle u_2, u_1\rangle + \cdots + a_r\langle u_r, u_1\rangle$

$= a_1\langle u_1, u_1\rangle + a_2 \cdot 0 + \cdots + a_r \cdot 0 = a_1\langle u_1, u_1\rangle$

Because $u_1 \ne 0$, we have $\langle u_1, u_1\rangle \ne 0$. Thus, $a_1 = 0$. Similarly, for $i = 2, \ldots, r$, taking the inner product of
(1) with $u_i$,

$0 = \langle 0, u_i\rangle = \langle a_1u_1 + \cdots + a_ru_r,\ u_i\rangle$

$= a_1\langle u_1, u_i\rangle + \cdots + a_i\langle u_i, u_i\rangle + \cdots + a_r\langle u_r, u_i\rangle = a_i\langle u_i, u_i\rangle$

But $\langle u_i, u_i\rangle \ne 0$, and hence, every $a_i = 0$. Thus, S is linearly independent.


7.16. Prove Theorem 7.6 (Pythagoras): Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Then

$\|u_1 + u_2 + \cdots + u_r\|^2 = \|u_1\|^2 + \|u_2\|^2 + \cdots + \|u_r\|^2$

Expanding the inner product, we have

$\|u_1 + u_2 + \cdots + u_r\|^2 = \langle u_1 + u_2 + \cdots + u_r,\ u_1 + u_2 + \cdots + u_r\rangle$

$= \langle u_1, u_1\rangle + \langle u_2, u_2\rangle + \cdots + \langle u_r, u_r\rangle + \sum_{i \ne j}\langle u_i, u_j\rangle$

The theorem follows from the fact that $\langle u_i, u_i\rangle = \|u_i\|^2$ and $\langle u_i, u_j\rangle = 0$ for $i \ne j$.


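7.17. Prove Theorem 7.7: Let $\{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of V. Then for any $v \in V$,

$v = \dfrac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}u_1 + \dfrac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle}u_2 + \cdots + \dfrac{\langle v, u_n\rangle}{\langle u_n, u_n\rangle}u_n$

Suppose $v = k_1u_1 + k_2u_2 + \cdots + k_nu_n$. Taking the inner product of both sides with $u_1$ yields

$\langle v, u_1\rangle = \langle k_1u_1 + k_2u_2 + \cdots + k_nu_n,\ u_1\rangle$

$= k_1\langle u_1, u_1\rangle + k_2\langle u_2, u_1\rangle + \cdots + k_n\langle u_n, u_1\rangle$

$= k_1\langle u_1, u_1\rangle + k_2 \cdot 0 + \cdots + k_n \cdot 0 = k_1\langle u_1, u_1\rangle$

Thus, $k_1 = \dfrac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}$. Similarly, for $i = 2, \ldots, n$,

$\langle v, u_i\rangle = \langle k_1u_1 + k_2u_2 + \cdots + k_nu_n,\ u_i\rangle$

$= k_1\langle u_1, u_i\rangle + k_2\langle u_2, u_i\rangle + \cdots + k_n\langle u_n, u_i\rangle$

$= k_1 \cdot 0 + \cdots + k_i\langle u_i, u_i\rangle + \cdots + k_n \cdot 0 = k_i\langle u_i, u_i\rangle$

Thus, $k_i = \dfrac{\langle v, u_i\rangle}{\langle u_i, u_i\rangle}$. Substituting for $k_i$ in the equation $v = k_1u_1 + \cdots + k_nu_n$, we obtain the desired result.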

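7.18. Suppose $E = \{e_1, e_2, \ldots, e_n\}$ is an orthonormal basis of V. Prove

(a) For any $u \in V$, we have $u = \langle u, e_1\rangle e_1 + \langle u, e_2\rangle e_2 + \cdots + \langle u, e_n\rangle e_n$.

(b) $\langle a_1e_1 + \cdots + a_ne_n,\ b_1e_1 + \cdots + b_ne_n\rangle = a_1b_1 + a_2b_2 + \cdots + a_nb_n$.

(c) For any $u, v \in V$, we have $\langle u, v\rangle = \langle u, e_1\rangle\langle v, e_1\rangle + \cdots + \langle u, e_n\rangle\langle v, e_n\rangle$.

(a) Suppose $u = k_1e_1 + k_2e_2 + \cdots + k_ne_n$. Taking the inner product of u with $e_1$,

$\langle u, e_1\rangle = \langle k_1e_1 + k_2e_2 + \cdots + k_ne_n,\ e_1\rangle$

$= k_1\langle e_1, e_1\rangle + k_2\langle e_2, e_1\rangle + \cdots + k_n\langle e_n, e_1\rangle$

$= k_1(1) + k_2(0) + \cdots + k_n(0) = k_1$

Similarly, for $i = 2, \ldots, n$,

$\langle u, e_i\rangle = \langle k_1e_1 + \cdots + k_ie_i + \cdots + k_ne_n,\ e_i\rangle$

$= k_1\langle e_1, e_i\rangle + \cdots + k_i\langle e_i, e_i\rangle + \cdots + k_n\langle e_n, e_i\rangle$

$= k_1(0) + \cdots + k_i(1) + \cdots + k_n(0) = k_i$

Substituting $\langle u, e_i\rangle$ for $k_i$ in the equation $u = k_1e_1 + \cdots + k_ne_n$, we obtain the desired result.

(b) We have

$\left\langle \sum_{i=1}^{n} a_ie_i,\ \sum_{j=1}^{n} b_je_j \right\rangle = \sum_{i,j=1}^{n} a_ib_j\langle e_i, e_j\rangle = \sum_{i=1}^{n} a_ib_i\langle e_i, e_i\rangle + \sum_{i \ne j} a_ib_j\langle e_i, e_j\rangle$

But $\langle e_i, e_j\rangle = 0$ for $i \ne j$, and $\langle e_i, e_j\rangle = 1$ for $i = j$. Hence, as required,

$\left\langle \sum_{i=1}^{n} a_ie_i,\ \sum_{j=1}^{n} b_je_j \right\rangle = \sum_{i=1}^{n} a_ib_i = a_1b_1 + a_2b_2 + \cdots + a_nb_n$

(c) By part (a), we have

$u = \langle u, e_1\rangle e_1 + \cdots + \langle u, e_n\rangle e_n$ and $v = \langle v, e_1\rangle e_1 + \cdots + \langle v, e_n\rangle e_n$

Thus, by part (b),

$\langle u, v\rangle = \langle u, e_1\rangle\langle v, e_1\rangle + \langle u, e_2\rangle\langle v, e_2\rangle + \cdots + \langle u, e_n\rangle\langle v, e_n\rangle$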


Projections, Gram-Schmidt Algorithm, Applications 

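7.19. Suppose $w \ne 0$. Let v be any vector in V. Show that

$c = \dfrac{\langle v, w\rangle}{\langle w, w\rangle} = \dfrac{\langle v, w\rangle}{\|w\|^2}$

is the unique scalar such that $v' = v - cw$ is orthogonal to w.

In order for v' to be orthogonal to w we must have

$\langle v - cw, w\rangle = 0$ or $\langle v, w\rangle - c\langle w, w\rangle = 0$ or $\langle v, w\rangle = c\langle w, w\rangle$

Thus, $c = \dfrac{\langle v, w\rangle}{\langle w, w\rangle}$. Conversely, suppose $c = \dfrac{\langle v, w\rangle}{\langle w, w\rangle}$. Then

$\langle v - cw, w\rangle = \langle v, w\rangle - c\langle w, w\rangle = \langle v, w\rangle - \dfrac{\langle v, w\rangle}{\langle w, w\rangle}\langle w, w\rangle = 0$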

7.20. Find the Fourier coefficient c and the projection of $v = (1, -2, 3, -4)$ along $w = (1, 2, 1, 2)$ in $R^4$.

Compute $\langle v, w\rangle = 1 - 4 + 3 - 8 = -8$ and $\|w\|^2 = 1 + 4 + 1 + 4 = 10$. Then

$c = -\frac{8}{10} = -\frac{4}{5}$ and $\mathrm{proj}(v, w) = cw = \left(-\frac{4}{5}, -\frac{8}{5}, -\frac{4}{5}, -\frac{8}{5}\right)$

7.21. Consider the subspace U of $R^4$ spanned by the vectors:

$v_1 = (1, 1, 1, 1), \quad v_2 = (1, 1, 2, 4), \quad v_3 = (1, 2, -4, -3)$

Find (a) an orthogonal basis of U ; (b) an orthonormal basis of U. 

(a) Use the Gram-Schmidt algorithm. Begin by setting $w_1 = v_1 = (1, 1, 1, 1)$. Next find

$v_2 - \dfrac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 = (1, 1, 2, 4) - \dfrac{8}{4}(1, 1, 1, 1) = (-1, -1, 0, 2)$

Set $w_2 = (-1, -1, 0, 2)$. Then find

$v_3 - \dfrac{\langle v_3, w_1\rangle}{\langle w_1, w_1\rangle}w_1 - \dfrac{\langle v_3, w_2\rangle}{\langle w_2, w_2\rangle}w_2 = (1, 2, -4, -3) - \dfrac{(-4)}{4}(1, 1, 1, 1) - \dfrac{(-9)}{6}(-1, -1, 0, 2) = \left(\tfrac{1}{2}, \tfrac{3}{2}, -3, 1\right)$

Clear fractions to obtain $w_3 = (1, 3, -6, 2)$. Then $w_1, w_2, w_3$ form an orthogonal basis of U.

(b) Normalize the orthogonal basis consisting of $w_1, w_2, w_3$. Because $\|w_1\|^2 = 4$, $\|w_2\|^2 = 6$, and
$\|w_3\|^2 = 50$, the following vectors form an orthonormal basis of U:

$u_1 = \dfrac{1}{2}(1, 1, 1, 1), \quad u_2 = \dfrac{1}{\sqrt{6}}(-1, -1, 0, 2), \quad u_3 = \dfrac{1}{5\sqrt{2}}(1, 3, -6, 2)$
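
The Gram-Schmidt steps of Problem 7.21 can be replayed with exact fractions. The sketch below is a minimal version that assumes the input vectors are linearly independent (otherwise a division by zero occurs) and does not clear fractions or normalize; the function names are illustrative assumptions.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthogonal basis of the span of the given vectors.

    Each new vector has its components along the previously found
    basis vectors subtracted off. Assumes linear independence.
    """
    basis = []
    for v in vectors:
        w = [Fraction(a) for a in v]
        for b in basis:
            c = Fraction(dot(v, b)) / dot(b, b)   # Fourier coefficient
            w = [wi - c * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis

vectors = [(1, 1, 1, 1), (1, 1, 2, 4), (1, 2, -4, -3)]
w1, w2, w3 = gram_schmidt(vectors)
squared_norms = [dot(w1, w1), dot(w2, w2), dot([2 * a for a in w3], [2 * a for a in w3])]
```

Here `w3` is the vector before fractions are cleared; doubling it gives (1, 3, -6, 2), whose squared norm is the value 50 used in part (b).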


7.22. Consider the vector space P(t) with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Apply the Gram-
Schmidt algorithm to the set $\{1, t, t^2\}$ to obtain an orthogonal set $\{f_0, f_1, f_2\}$ with integer
coefficients.

First set $f_0 = 1$. Then find

$t - \dfrac{\langle t, 1\rangle}{\langle 1, 1\rangle} \cdot 1 = t - \dfrac{1/2}{1} \cdot 1 = t - \dfrac{1}{2}$

Clear fractions to obtain $f_1 = 2t - 1$. Then find

$t^2 - \dfrac{\langle t^2, 1\rangle}{\langle 1, 1\rangle}(1) - \dfrac{\langle t^2, 2t - 1\rangle}{\langle 2t - 1, 2t - 1\rangle}(2t - 1) = t^2 - \dfrac{1/3}{1}(1) - \dfrac{1/6}{1/3}(2t - 1) = t^2 - t + \dfrac{1}{6}$

Clear fractions to obtain $f_2 = 6t^2 - 6t + 1$. Thus, $\{1,\ 2t - 1,\ 6t^2 - 6t + 1\}$ is the required orthogonal set.


7.23. Suppose $v = (1, 3, 5, 7)$. Find the projection of v onto W or, in other words, find $w \in W$ that
minimizes $\|v - w\|$, where W is the subspace of $R^4$ spanned by

(a) $u_1 = (1, 1, 1, 1)$ and $u_2 = (1, -3, 4, -2)$,

(b) $v_1 = (1, 1, 1, 1)$ and $v_2 = (1, 2, 3, 2)$.

(a) Because $u_1$ and $u_2$ are orthogonal, we need only compute the Fourier coefficients:

$c_1 = \dfrac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle} = \dfrac{1 + 3 + 5 + 7}{1 + 1 + 1 + 1} = \dfrac{16}{4} = 4$

$c_2 = \dfrac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle} = \dfrac{1 - 9 + 20 - 14}{1 + 9 + 16 + 4} = \dfrac{-2}{30} = -\dfrac{1}{15}$

Then $w = \mathrm{proj}(v, W) = c_1u_1 + c_2u_2 = 4(1, 1, 1, 1) - \frac{1}{15}(1, -3, 4, -2) = \left(\frac{59}{15}, \frac{63}{15}, \frac{56}{15}, \frac{62}{15}\right)$.

(b) Because $v_1$ and $v_2$ are not orthogonal, first apply the Gram-Schmidt algorithm to find an orthogonal basis
for W. Set $w_1 = v_1 = (1, 1, 1, 1)$. Then find

$v_2 - \dfrac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 = (1, 2, 3, 2) - \dfrac{8}{4}(1, 1, 1, 1) = (-1, 0, 1, 0)$

Set $w_2 = (-1, 0, 1, 0)$. Now compute

$c_1 = \dfrac{\langle v, w_1\rangle}{\langle w_1, w_1\rangle} = \dfrac{1 + 3 + 5 + 7}{1 + 1 + 1 + 1} = \dfrac{16}{4} = 4$

$c_2 = \dfrac{\langle v, w_2\rangle}{\langle w_2, w_2\rangle} = \dfrac{-1 + 0 + 5 + 0}{1 + 0 + 1 + 0} = \dfrac{4}{2} = 2$

Then $w = \mathrm{proj}(v, W) = c_1w_1 + c_2w_2 = 4(1, 1, 1, 1) + 2(-1, 0, 1, 0) = (2, 4, 6, 4)$.
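
Both parts of Problem 7.23 use the same formula once an orthogonal basis of W is in hand. The sketch below projects onto the span of an orthogonal list of integer vectors using exact fractions; it does not itself orthogonalize, so part (b) is fed the orthogonal basis found in the solution. The function names are illustrative assumptions.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(v, orthogonal_basis):
    """Projection of v onto the span of mutually orthogonal integer vectors.

    Sums the components c_i u_i with c_i = <v, u_i>/<u_i, u_i>.
    """
    w = [Fraction(0)] * len(v)
    for u in orthogonal_basis:
        c = Fraction(dot(v, u), dot(u, u))
        w = [wi + c * ui for wi, ui in zip(w, u)]
    return w

v = (1, 3, 5, 7)
proj_a = project(v, [(1, 1, 1, 1), (1, -3, 4, -2)])
proj_b = project(v, [(1, 1, 1, 1), (-1, 0, 1, 0)])

# The residual v - w should be orthogonal to every basis vector of W.
residual_b = [a - b for a, b in zip(v, proj_b)]
```

Note that the second entry of `proj_a` prints in lowest terms as 21/5, which equals the 63/15 written in the solution.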


7.24. Suppose $w_1$ and $w_2$ are nonzero orthogonal vectors. Let v be any vector in V. Find $c_1$ and $c_2$ so that
v' is orthogonal to $w_1$ and $w_2$, where $v' = v - c_1w_1 - c_2w_2$.

If v' is orthogonal to $w_1$, then

$0 = \langle v - c_1w_1 - c_2w_2,\ w_1\rangle = \langle v, w_1\rangle - c_1\langle w_1, w_1\rangle - c_2\langle w_2, w_1\rangle$

$= \langle v, w_1\rangle - c_1\langle w_1, w_1\rangle - c_2 \cdot 0 = \langle v, w_1\rangle - c_1\langle w_1, w_1\rangle$

Thus, $c_1 = \langle v, w_1\rangle / \langle w_1, w_1\rangle$. (That is, $c_1$ is the component of v along $w_1$.) Similarly, if v' is orthogonal to $w_2$,
then

$0 = \langle v - c_1w_1 - c_2w_2,\ w_2\rangle = \langle v, w_2\rangle - c_2\langle w_2, w_2\rangle$

Thus, $c_2 = \langle v, w_2\rangle / \langle w_2, w_2\rangle$. (That is, $c_2$ is the component of v along $w_2$.)

7.25. Prove Theorem 7.8: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in V. Let
$v \in V$. Define

$v' = v - (c_1w_1 + c_2w_2 + \cdots + c_rw_r)$, where $c_i = \dfrac{\langle v, w_i\rangle}{\langle w_i, w_i\rangle}$

Then v' is orthogonal to $w_1, w_2, \ldots, w_r$.

For $i = 1, 2, \ldots, r$ and using $\langle w_i, w_j\rangle = 0$ for $i \ne j$, we have

$\langle v - c_1w_1 - c_2w_2 - \cdots - c_rw_r,\ w_i\rangle = \langle v, w_i\rangle - c_1\langle w_1, w_i\rangle - \cdots - c_i\langle w_i, w_i\rangle - \cdots - c_r\langle w_r, w_i\rangle$

$= \langle v, w_i\rangle - c_1 \cdot 0 - \cdots - c_i\langle w_i, w_i\rangle - \cdots - c_r \cdot 0$

$= \langle v, w_i\rangle - c_i\langle w_i, w_i\rangle = \langle v, w_i\rangle - \dfrac{\langle v, w_i\rangle}{\langle w_i, w_i\rangle}\langle w_i, w_i\rangle = 0$

The theorem is proved.


7.26. Prove Theorem 7.9: Let $\{v_1, v_2, \ldots, v_n\}$ be any basis of an inner product space V. Then there exists
an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ of V such that the change-of-basis matrix from $\{v_i\}$ to $\{u_i\}$ is
triangular; that is, for $k = 1, 2, \ldots, n$,

$u_k = a_{k1}v_1 + a_{k2}v_2 + \cdots + a_{kk}v_k$

The proof uses the Gram-Schmidt algorithm and Remarks 1 and 3 of Section 7.7. That is, apply the
algorithm to $\{v_i\}$ to obtain an orthogonal basis $\{w_1, \ldots, w_n\}$, and then normalize $\{w_i\}$ to obtain an
orthonormal basis $\{u_i\}$ of V. The specific algorithm guarantees that each $w_k$ is a linear combination of
$v_1, \ldots, v_k$, and hence, each $u_k$ is a linear combination of $v_1, \ldots, v_k$.

7.27. Prove Theorem 7.10: Suppose $S = \{w_1, w_2, \ldots, w_r\}$ is an orthogonal basis for a subspace W of V.
Then one may extend S to an orthogonal basis for V; that is, one may find vectors $w_{r+1}, \ldots, w_n$ such
that $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for V.

Extend S to a basis $S' = \{w_1, \ldots, w_r, v_{r+1}, \ldots, v_n\}$ for V. Applying the Gram-Schmidt algorithm to S',
we first obtain $w_1, w_2, \ldots, w_r$ because S is orthogonal, and then we obtain vectors $w_{r+1}, \ldots, w_n$, where
$\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for V. Thus, the theorem is proved.

7.28. Prove Theorem 7.4: Let W be a subspace of V. Then $V = W \oplus W^\perp$.

By Theorem 7.9, there exists an orthogonal basis $\{u_1, \ldots, u_r\}$ of W, and by Theorem 7.10 we can extend
it to an orthogonal basis $\{u_1, u_2, \ldots, u_n\}$ of V. Hence, $u_{r+1}, \ldots, u_n \in W^\perp$. If $v \in V$, then

$v = a_1u_1 + \cdots + a_nu_n$, where $a_1u_1 + \cdots + a_ru_r \in W$ and $a_{r+1}u_{r+1} + \cdots + a_nu_n \in W^\perp$

Accordingly, $V = W + W^\perp$.

On the other hand, if $w \in W \cap W^\perp$, then $\langle w, w\rangle = 0$. This yields $w = 0$. Hence, $W \cap W^\perp = \{0\}$.

The two conditions $V = W + W^\perp$ and $W \cap W^\perp = \{0\}$ give the desired result $V = W \oplus W^\perp$.

Remark: Note that we have proved the theorem for the case that V has finite dimension. We
remark that the theorem also holds for spaces of arbitrary dimension.

7.29. Suppose $W$ is a subspace of a finite-dimensional space $V$. Prove that $W = W^{\perp\perp}$.

By Theorem 7.4, $V = W \oplus W^\perp$, and also $V = W^\perp \oplus W^{\perp\perp}$. Hence,

$$\dim W = \dim V - \dim W^\perp \quad\text{and}\quad \dim W^{\perp\perp} = \dim V - \dim W^\perp$$

This yields $\dim W = \dim W^{\perp\perp}$. But $W \subseteq W^{\perp\perp}$ (see Problem 7.14). Hence, $W = W^{\perp\perp}$, as required.

7.30. Prove the following: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in $V$. Let $v$ be
any vector in $V$ and let $c_i$ be the component of $v$ along $w_i$. Then, for any scalars $a_1, \ldots, a_r$, we have

$$\Bigl\| v - \sum_{k=1}^{r} c_k w_k \Bigr\| \le \Bigl\| v - \sum_{k=1}^{r} a_k w_k \Bigr\|$$

That is, $\sum c_i w_i$ is the closest approximation to $v$ as a linear combination of $w_1, \ldots, w_r$.

By Theorem 7.8, $v - \sum c_k w_k$ is orthogonal to every $w_i$ and hence orthogonal to any linear combination
of $w_1, w_2, \ldots, w_r$. Therefore, using the Pythagorean theorem and summing from $k = 1$ to $r$,

$$
\begin{aligned}
\Bigl\| v - \sum a_k w_k \Bigr\|^2
&= \Bigl\| v - \sum c_k w_k + \sum (c_k - a_k) w_k \Bigr\|^2
 = \Bigl\| v - \sum c_k w_k \Bigr\|^2 + \Bigl\| \sum (c_k - a_k) w_k \Bigr\|^2 \\
&\ge \Bigl\| v - \sum c_k w_k \Bigr\|^2
\end{aligned}
$$

The square root of both sides gives our theorem. 

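7.31. Suppose $\{e_1, e_2, \ldots, e_r\}$ is an orthonormal set of vectors in $V$. Let $v$ be any vector in $V$ and let $c_i$ be
the Fourier coefficient of $v$ with respect to $e_i$. Prove Bessel's inequality:

$$\sum_{k=1}^{r} c_k^2 \le \|v\|^2$$

Note that $c_i = \langle v, e_i \rangle$, because $\|e_i\| = 1$. Then, using $\langle e_i, e_j \rangle = 0$ for $i \neq j$ and summing from $k = 1$ to $r$,
we get

$$
\begin{aligned}
0 \le \Bigl\langle v - \sum c_k e_k,\ v - \sum c_k e_k \Bigr\rangle
&= \langle v, v \rangle - 2 \Bigl\langle v, \sum c_k e_k \Bigr\rangle + \sum c_k^2
 = \langle v, v \rangle - \sum 2 c_k \langle v, e_k \rangle + \sum c_k^2 \\
&= \langle v, v \rangle - \sum 2 c_k^2 + \sum c_k^2 = \langle v, v \rangle - \sum c_k^2
\end{aligned}
$$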

This gives us our inequality. 
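
Bessel's inequality can be checked on a small example. The following Python sketch is only an illustration: the orthonormal pair `e1`, `e2` and the vector `v` are values chosen here, not data from the text, and the helper name `dot` is our own.

```python
from fractions import Fraction

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical orthonormal set in R^3, chosen for illustration.
e1 = (Fraction(1), Fraction(0), Fraction(0))
e2 = (Fraction(0), Fraction(3, 5), Fraction(4, 5))
v = (Fraction(2), Fraction(1), Fraction(-1))

# Fourier coefficients c_k = <v, e_k>, valid because each ||e_k|| = 1.
coeffs = [dot(v, e) for e in (e1, e2)]
bessel_lhs = sum(c * c for c in coeffs)   # sum of the c_k^2
bessel_rhs = dot(v, v)                    # ||v||^2

print(bessel_lhs, "<=", bessel_rhs)       # prints: 101/25 <= 6
```

Exact fractions are used so that the comparison does not depend on floating-point rounding.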


Orthogonal Matrices 

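7.32. Find an orthogonal matrix $P$ whose first row is $u_1 = (\tfrac{1}{3}, \tfrac{2}{3}, \tfrac{2}{3})$.

First find a nonzero vector $w_2 = (x, y, z)$ that is orthogonal to $u_1$, that is, for which

$$0 = \langle u_1, w_2 \rangle = \frac{x}{3} + \frac{2y}{3} + \frac{2z}{3} \quad\text{or}\quad x + 2y + 2z = 0$$

One such solution is $w_2 = (0, 1, -1)$. Normalize $w_2$ to obtain the second row of $P$:

$$u_2 = (0,\ 1/\sqrt{2},\ -1/\sqrt{2})$$

Next find a nonzero vector $w_3 = (x, y, z)$ that is orthogonal to both $u_1$ and $u_2$, that is, for which

$$
\begin{aligned}
0 &= \langle u_1, w_3 \rangle = \frac{x}{3} + \frac{2y}{3} + \frac{2z}{3} \quad\text{or}\quad x + 2y + 2z = 0 \\
0 &= \langle u_2, w_3 \rangle = \frac{y}{\sqrt{2}} - \frac{z}{\sqrt{2}} \quad\text{or}\quad y - z = 0
\end{aligned}
$$

Set $z = -1$ and find the solution $w_3 = (4, -1, -1)$. Normalize $w_3$ and obtain the third row of $P$; that is,

$$u_3 = (4/\sqrt{18},\ -1/\sqrt{18},\ -1/\sqrt{18})$$

Thus,

$$P = \begin{bmatrix} \tfrac{1}{3} & \tfrac{2}{3} & \tfrac{2}{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 4/(3\sqrt{2}) & -1/(3\sqrt{2}) & -1/(3\sqrt{2}) \end{bmatrix}$$

We emphasize that the above matrix $P$ is not unique.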


7.33. Let $A = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 3 & 4 \\ 7 & -5 & 2 \end{bmatrix}$. Determine whether or not: (a) the rows of $A$ are orthogonal;
(b) $A$ is an orthogonal matrix; (c) the columns of $A$ are orthogonal.

(a) Yes, because $(1, 1, -1) \cdot (1, 3, 4) = 1 + 3 - 4 = 0$, $(1, 1, -1) \cdot (7, -5, 2) = 7 - 5 - 2 = 0$, and
$(1, 3, 4) \cdot (7, -5, 2) = 7 - 15 + 8 = 0$.

(b) No, because the rows of $A$ are not unit vectors; for example, $\|(1, 1, -1)\|^2 = 1 + 1 + 1 = 3$.

(c) No; for example, $(1, 1, 7) \cdot (1, 3, -5) = 1 + 3 - 35 = -31 \neq 0$.


7.34. Let B be the matrix obtained by normalizing each row of A in Problem 7.33. 

(a) Find B. 

(b) Is B an orthogonal matrix? 

(c) Are the columns of B orthogonal? 
(a) We have

$$\|(1, 1, -1)\|^2 = 1 + 1 + 1 = 3, \qquad \|(1, 3, 4)\|^2 = 1 + 9 + 16 = 26, \qquad \|(7, -5, 2)\|^2 = 49 + 25 + 4 = 78$$

Thus,

$$B = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & -1/\sqrt{3} \\ 1/\sqrt{26} & 3/\sqrt{26} & 4/\sqrt{26} \\ 7/\sqrt{78} & -5/\sqrt{78} & 2/\sqrt{78} \end{bmatrix}$$

(b) Yes, because the rows of B are still orthogonal and are now unit vectors. 

(c) Yes, because the rows of B form an orthonormal set of vectors. Then, by Theorem 7.11, the columns of B 
must automatically form an orthonormal set. 
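
The conclusions of Problems 7.33 and 7.34 can be confirmed numerically. The sketch below is an illustration only; the helper names `normalize_rows` and `is_orthogonal` are our own, and the test uses a floating-point tolerance rather than exact arithmetic.

```python
import math

def normalize_rows(rows):
    """Divide each row by its Euclidean length."""
    out = []
    for r in rows:
        length = math.sqrt(sum(x * x for x in r))
        out.append([x / length for x in r])
    return out

def is_orthogonal(m, tol=1e-12):
    """Check M M^T = I entrywise, up to a floating-point tolerance."""
    n = len(m)
    for i in range(n):
        for j in range(n):
            entry = sum(m[i][k] * m[j][k] for k in range(len(m[i])))
            if abs(entry - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

A = [[1, 1, -1], [1, 3, 4], [7, -5, 2]]   # matrix of Problem 7.33
B = normalize_rows(A)                      # matrix of Problem 7.34
Bt = [list(col) for col in zip(*B)]        # transpose of B

print(is_orthogonal(A), is_orthogonal(B), is_orthogonal(Bt))
```

Applying `is_orthogonal` to the transpose `Bt` tests the columns of $B$, which is the content of part (c).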

7.35. Prove each of the following: 

(a) $P$ is orthogonal if and only if $P^T$ is orthogonal.

(b) If $P$ is orthogonal, then $P^{-1}$ is orthogonal.

(c) If $P$ and $Q$ are orthogonal, then $PQ$ is orthogonal.

(a) We have $(P^T)^T = P$. Thus, $P$ is orthogonal if and only if $PP^T = I$ if and only if $(P^T)^T P^T = I$ if and only if
$P^T$ is orthogonal.

(b) We have $P^T = P^{-1}$, because $P$ is orthogonal. Thus, by part (a), $P^{-1}$ is orthogonal.

(c) We have $P^T = P^{-1}$ and $Q^T = Q^{-1}$. Thus, $(PQ)(PQ)^T = PQQ^TP^T = PQQ^{-1}P^{-1} = I$. Therefore,
$(PQ)^T = (PQ)^{-1}$, and so $PQ$ is orthogonal.

7.36. Suppose P is an orthogonal matrix. Show that 

(a) $\langle Pu, Pv \rangle = \langle u, v \rangle$ for any $u, v \in V$;

(b) $\|Pu\| = \|u\|$ for every $u \in V$.

Use $P^TP = I$ and $\langle u, v \rangle = u^Tv$.

(a) $\langle Pu, Pv \rangle = (Pu)^T(Pv) = u^TP^TPv = u^Tv = \langle u, v \rangle$.

(b) We have

$$\|Pu\|^2 = \langle Pu, Pu \rangle = u^TP^TPu = u^Tu = \langle u, u \rangle = \|u\|^2$$

Taking the square root of both sides gives our result.

7.37. Prove Theorem 7.12: Suppose $E = \{e_i\}$ and $E' = \{e'_i\}$ are orthonormal bases of $V$. Let $P$ be the
change-of-basis matrix from $E$ to $E'$. Then $P$ is orthogonal.

Suppose

$$e'_i = b_{i1}e_1 + b_{i2}e_2 + \cdots + b_{in}e_n, \qquad i = 1, \ldots, n \tag{1}$$

Using Problem 7.18(b) and the fact that $E'$ is orthonormal, we get

$$\delta_{ij} = \langle e'_i, e'_j \rangle = b_{i1}b_{j1} + b_{i2}b_{j2} + \cdots + b_{in}b_{jn} \tag{2}$$

Let $B = [b_{ij}]$ be the matrix of the coefficients in (1). (Then $P = B^T$.) Suppose $BB^T = [c_{ij}]$. Then

$$c_{ij} = b_{i1}b_{j1} + b_{i2}b_{j2} + \cdots + b_{in}b_{jn} \tag{3}$$

By (2) and (3), we have $c_{ij} = \delta_{ij}$. Thus, $BB^T = I$. Accordingly, $B$ is orthogonal, and hence, $P = B^T$ is
orthogonal.

7.38. Prove Theorem 7.13: Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space $V$. Let
$P = [a_{ij}]$ be an orthogonal matrix. Then the following $n$ vectors form an orthonormal basis for $V$:

$$e'_i = a_{1i}e_1 + a_{2i}e_2 + \cdots + a_{ni}e_n, \qquad i = 1, 2, \ldots, n$$

Because $\{e_i\}$ is orthonormal, we get, by Problem 7.18(b),

$$\langle e'_i, e'_j \rangle = a_{1i}a_{1j} + a_{2i}a_{2j} + \cdots + a_{ni}a_{nj} = \langle C_i, C_j \rangle$$

where $C_i$ denotes the $i$th column of the orthogonal matrix $P = [a_{ij}]$. Because $P$ is orthogonal, its columns form
an orthonormal set. This implies $\langle e'_i, e'_j \rangle = \langle C_i, C_j \rangle = \delta_{ij}$. Thus, $\{e'_i\}$ is an orthonormal basis.

Inner Products And Positive Definite Matrices 

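7.39. Which of the following symmetric matrices are positive definite?

$$\text{(a) } A = \begin{bmatrix} 3 & 4 \\ 4 & 5 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 8 & -3 \\ -3 & 2 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 2 & 1 \\ 1 & -3 \end{bmatrix}, \quad \text{(d) } D = \begin{bmatrix} 3 & 5 \\ 5 & 9 \end{bmatrix}$$

Use Theorem 7.14 that a $2 \times 2$ real symmetric matrix is positive definite if and only if its diagonal entries
are positive and its determinant is positive.

(a) No, because $|A| = 15 - 16 = -1$ is negative.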

(b) Yes. 

(c) No, because the diagonal entry —3 is negative. 

(d) Yes. 


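7.40. Find the values of $k$ that make each of the following matrices positive definite:

$$\text{(a) } A = \begin{bmatrix} 2 & -4 \\ -4 & k \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 4 & k \\ k & 9 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} k & 5 \\ 5 & -2 \end{bmatrix}$$

(a) First, $k$ must be positive. Also, $|A| = 2k - 16$ must be positive; that is, $2k - 16 > 0$. Hence, $k > 8$.

(b) We need $|B| = 36 - k^2$ positive; that is, $36 - k^2 > 0$. Hence, $k^2 < 36$ or $-6 < k < 6$.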

(c) C can never be positive definite, because C has a negative diagonal entry —2. 
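
The test of Theorem 7.14 used in Problems 7.39 and 7.40 is easy to mechanize. The following Python sketch is only an illustration; the function name `is_positive_definite_2x2` is our own choice.

```python
def is_positive_definite_2x2(a, b, d):
    """Theorem 7.14 test for the real symmetric matrix [[a, b], [b, d]]:
    both diagonal entries and the determinant must be positive."""
    return a > 0 and d > 0 and a * d - b * b > 0

# Problem 7.39: A = [3 4; 4 5], C = [2 1; 1 -3], D = [3 5; 5 9]
print(is_positive_definite_2x2(3, 4, 5))    # False: |A| = -1
print(is_positive_definite_2x2(2, 1, -3))   # False: negative diagonal entry
print(is_positive_definite_2x2(3, 5, 9))    # True:  |D| = 2

# Problem 7.40(a): [2 -4; -4 k] is positive definite exactly when k > 8
print(is_positive_definite_2x2(2, -4, 8))   # False: determinant is 0
print(is_positive_definite_2x2(2, -4, 9))   # True
```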


7.41. Find the matrix $A$ that represents the usual inner product on $\mathbf{R}^2$ relative to each of the following
bases of $\mathbf{R}^2$: (a) $\{v_1 = (1, 3),\ v_2 = (2, 5)\}$; (b) $\{w_1 = (1, 2),\ w_2 = (4, -2)\}$.

(a) Compute $\langle v_1, v_1 \rangle = 1 + 9 = 10$, $\langle v_1, v_2 \rangle = 2 + 15 = 17$, $\langle v_2, v_2 \rangle = 4 + 25 = 29$. Thus,

$$A = \begin{bmatrix} 10 & 17 \\ 17 & 29 \end{bmatrix}$$

(b) Compute $\langle w_1, w_1 \rangle = 1 + 4 = 5$, $\langle w_1, w_2 \rangle = 4 - 4 = 0$, $\langle w_2, w_2 \rangle = 16 + 4 = 20$. Thus,

$$A = \begin{bmatrix} 5 & 0 \\ 0 & 20 \end{bmatrix}$$

(Because the basis vectors are orthogonal, the matrix $A$ is diagonal.)
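
The matrices of Problem 7.41 are Gram matrices, whose $(i, j)$ entry is the inner product of the $i$th and $j$th basis vectors. A minimal Python sketch of that computation follows; the function name `gram_matrix` is our own and is not notation from the text.

```python
def gram_matrix(basis):
    """Matrix [<v_i, v_j>] of the usual inner product on R^n relative to `basis`."""
    return [[sum(a * b for a, b in zip(vi, vj)) for vj in basis] for vi in basis]

print(gram_matrix([(1, 3), (2, 5)]))    # part (a): [[10, 17], [17, 29]]
print(gram_matrix([(1, 2), (4, -2)]))   # part (b): [[5, 0], [0, 20]]
```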


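7.42. Consider the vector space $P_2(t)$ with inner product $\langle f, g \rangle = \int_{-1}^{1} f(t)g(t)\,dt$.

(a) Find $\langle f, g \rangle$, where $f(t) = t + 2$ and $g(t) = t^2 - 3t + 4$.

(b) Find the matrix $A$ of the inner product with respect to the basis $\{1, t, t^2\}$ of $V$.

(c) Verify Theorem 7.16 by showing that $\langle f, g \rangle = [f]^T A [g]$ with respect to the basis $\{1, t, t^2\}$.

(a) $$\langle f, g \rangle = \int_{-1}^{1} (t + 2)(t^2 - 3t + 4)\,dt = \int_{-1}^{1} (t^3 - t^2 - 2t + 8)\,dt = \left[ \frac{t^4}{4} - \frac{t^3}{3} - t^2 + 8t \right]_{-1}^{1} = \frac{46}{3}$$

(b) Here we use the fact that if $r + s = n$,

$$\langle t^r, t^s \rangle = \int_{-1}^{1} t^n\,dt = \left[ \frac{t^{n+1}}{n+1} \right]_{-1}^{1} = \begin{cases} 2/(n+1) & \text{if } n \text{ is even,} \\ 0 & \text{if } n \text{ is odd.} \end{cases}$$

Then $\langle 1, 1 \rangle = 2$, $\langle 1, t \rangle = 0$, $\langle 1, t^2 \rangle = \tfrac{2}{3}$, $\langle t, t \rangle = \tfrac{2}{3}$, $\langle t, t^2 \rangle = 0$, $\langle t^2, t^2 \rangle = \tfrac{2}{5}$. Thus,

$$A = \begin{bmatrix} 2 & 0 & \tfrac{2}{3} \\ 0 & \tfrac{2}{3} & 0 \\ \tfrac{2}{3} & 0 & \tfrac{2}{5} \end{bmatrix}$$

(c) We have $[f]^T = (2, 1, 0)$ and $[g]^T = (4, -3, 1)$ relative to the given basis. Then

$$[f]^T A [g] = (2, 1, 0) \begin{bmatrix} 2 & 0 & \tfrac{2}{3} \\ 0 & \tfrac{2}{3} & 0 \\ \tfrac{2}{3} & 0 & \tfrac{2}{5} \end{bmatrix} \begin{bmatrix} 4 \\ -3 \\ 1 \end{bmatrix} = \left(4, \tfrac{2}{3}, \tfrac{4}{3}\right) \begin{bmatrix} 4 \\ -3 \\ 1 \end{bmatrix} = \frac{46}{3} = \langle f, g \rangle$$

7.43. Prove Theorem 7.14: $A = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$ is positive definite if and only if $a$ and $d$ are positive and
$|A| = ad - b^2$ is positive.

Let $u = [x, y]^T$. Then

$$f(u) = u^TAu = [x, y] \begin{bmatrix} a & b \\ b & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = ax^2 + 2bxy + dy^2$$

Suppose $f(u) > 0$ for every $u \neq 0$. Then $f(1, 0) = a > 0$ and $f(0, 1) = d > 0$. Also, we have
$f(b, -a) = a(ad - b^2) > 0$. Because $a > 0$, we get $ad - b^2 > 0$.

Conversely, suppose $a > 0$, $d > 0$, $ad - b^2 > 0$. Completing the square gives us

$$f(u) = a\left( x^2 + \frac{2b}{a}xy + \frac{b^2}{a^2}y^2 \right) + dy^2 - \frac{b^2}{a}y^2 = a\left( x + \frac{by}{a} \right)^2 + \frac{ad - b^2}{a}\,y^2$$

Accordingly, $f(u) > 0$ for every $u \neq 0$.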


7.44. Prove Theorem 7.15: Let $A$ be a real positive definite matrix. Then the function $\langle u, v \rangle = u^TAv$ is an
inner product on $\mathbf{R}^n$.

For any vectors $u_1$, $u_2$, and $v$,

$$\langle u_1 + u_2, v \rangle = (u_1 + u_2)^TAv = (u_1^T + u_2^T)Av = u_1^TAv + u_2^TAv = \langle u_1, v \rangle + \langle u_2, v \rangle$$

and, for any scalar $k$ and vectors $u$, $v$,

$$\langle ku, v \rangle = (ku)^TAv = ku^TAv = k\langle u, v \rangle$$

Thus $[I_1]$ is satisfied.

Because $u^TAv$ is a scalar, $(u^TAv)^T = u^TAv$. Also, $A^T = A$ because $A$ is symmetric. Therefore,

$$\langle u, v \rangle = u^TAv = (u^TAv)^T = v^TA^Tu^{TT} = v^TAu = \langle v, u \rangle$$

Thus, $[I_2]$ is satisfied.

Last, because $A$ is positive definite, $X^TAX > 0$ for any nonzero $X \in \mathbf{R}^n$. Thus, for any nonzero vector
$v$, $\langle v, v \rangle = v^TAv > 0$. Also, $\langle 0, 0 \rangle = 0^TA0 = 0$. Thus, $[I_3]$ is satisfied. Accordingly, the function $\langle u, v \rangle = u^TAv$
is an inner product.


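7.45. Prove Theorem 7.16: Let $A$ be the matrix representation of an inner product relative to a basis $S$ of
$V$. Then, for any vectors $u, v \in V$, we have

$$\langle u, v \rangle = [u]^TA[v]$$

Suppose $S = \{w_1, w_2, \ldots, w_n\}$ and $A = [k_{ij}]$. Hence, $k_{ij} = \langle w_i, w_j \rangle$. Suppose

$$u = a_1w_1 + a_2w_2 + \cdots + a_nw_n \quad\text{and}\quad v = b_1w_1 + b_2w_2 + \cdots + b_nw_n$$

Then

$$\langle u, v \rangle = \sum_{i=1}^{n} \sum_{j=1}^{n} a_ib_j \langle w_i, w_j \rangle \tag{1}$$

On the other hand,

$$
\begin{aligned}
[u]^TA[v] &= (a_1, a_2, \ldots, a_n) \begin{bmatrix} k_{11} & k_{12} & \cdots & k_{1n} \\ k_{21} & k_{22} & \cdots & k_{2n} \\ \vdots & \vdots & & \vdots \\ k_{n1} & k_{n2} & \cdots & k_{nn} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} \\
&= \left( \sum_{i=1}^{n} a_ik_{i1},\ \sum_{i=1}^{n} a_ik_{i2},\ \ldots,\ \sum_{i=1}^{n} a_ik_{in} \right) \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = \sum_{j=1}^{n} \sum_{i=1}^{n} a_ib_jk_{ij}
\end{aligned} \tag{2}
$$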


Equations (1) and (2) give us our result. 
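
Theorem 7.16 can be checked against the data of Problem 7.42. The Python sketch below is an illustration only: the function names are our own, and polynomials are represented by their coefficient lists relative to $\{1, t, t^2\}$. One function integrates the product polynomial directly over $[-1, 1]$; the other evaluates $[f]^TA[g]$.

```python
from fractions import Fraction

def monomial_inner(r, s):
    """<t^r, t^s> = integral of t^(r+s) over [-1, 1]."""
    n = r + s
    return Fraction(2, n + 1) if n % 2 == 0 else Fraction(0)

def integrate_product(f, g):
    """Integral over [-1, 1] of f(t) g(t), with f and g given as [c0, c1, c2, ...]."""
    prod = [Fraction(0)] * (len(f) + len(g) - 1)
    for r, a in enumerate(f):
        for s, b in enumerate(g):
            prod[r + s] += a * b          # coefficient of t^(r+s) in the product
    # Only even powers contribute: the integral of t^n is 2/(n+1) for even n.
    return sum(c * Fraction(2, n + 1) for n, c in enumerate(prod) if n % 2 == 0)

def coordinate_form(u, A, v):
    """Compute [u]^T A [v]."""
    return sum(u[i] * A[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

A = [[monomial_inner(i, j) for j in range(3)] for i in range(3)]
f = [2, 1, 0]     # f(t) = t + 2
g = [4, -3, 1]    # g(t) = t^2 - 3t + 4

print(integrate_product(f, g), coordinate_form(f, A, g))   # prints: 46/3 46/3
```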
7.46. Prove Theorem 7.17: Let $A$ be the matrix representation of any inner product on $V$. Then $A$ is a
positive definite matrix.

Because $\langle w_i, w_j \rangle = \langle w_j, w_i \rangle$ for any basis vectors $w_i$ and $w_j$, the matrix $A$ is symmetric. Let $X$ be any
nonzero vector in $\mathbf{R}^n$. Then $[u] = X$ for some nonzero vector $u \in V$. Theorem 7.16 tells us that
$X^TAX = [u]^TA[u] = \langle u, u \rangle > 0$. Thus, $A$ is positive definite.


Complex Inner Product Spaces 

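7.47. Let $V$ be a complex inner product space. Verify the relation

$$\langle u, av_1 + bv_2 \rangle = \bar{a}\langle u, v_1 \rangle + \bar{b}\langle u, v_2 \rangle$$

Using $[I_2^*]$, $[I_1^*]$, and then $[I_2^*]$, we find

$$\langle u, av_1 + bv_2 \rangle = \overline{\langle av_1 + bv_2, u \rangle} = \overline{a\langle v_1, u \rangle + b\langle v_2, u \rangle} = \bar{a}\,\overline{\langle v_1, u \rangle} + \bar{b}\,\overline{\langle v_2, u \rangle} = \bar{a}\langle u, v_1 \rangle + \bar{b}\langle u, v_2 \rangle$$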


7.48. Suppose $\langle u, v \rangle = 3 + 2i$ in a complex inner product space $V$. Find

(a) $\langle (2 - 4i)u, v \rangle$; (b) $\langle u, (4 + 3i)v \rangle$; (c) $\langle (3 - 6i)u, (5 - 2i)v \rangle$.

(a) $\langle (2 - 4i)u, v \rangle = (2 - 4i)\langle u, v \rangle = (2 - 4i)(3 + 2i) = 14 - 8i$

(b) $\langle u, (4 + 3i)v \rangle = \overline{(4 + 3i)}\langle u, v \rangle = (4 - 3i)(3 + 2i) = 18 - i$

(c) $\langle (3 - 6i)u, (5 - 2i)v \rangle = (3 - 6i)\overline{(5 - 2i)}\langle u, v \rangle = (3 - 6i)(5 + 2i)(3 + 2i) = 129 - 18i$


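7.49. Find the Fourier coefficient (component) $c$ and the projection $cw$ of $v = (3 + 4i,\ 2 - 3i)$ along
$w = (5 + i,\ 2i)$ in $\mathbf{C}^2$.

Recall that $c = \langle v, w \rangle / \langle w, w \rangle$. Compute

$$
\begin{aligned}
\langle v, w \rangle &= (3 + 4i)\overline{(5 + i)} + (2 - 3i)\overline{(2i)} = (3 + 4i)(5 - i) + (2 - 3i)(-2i) \\
&= 19 + 17i - 6 - 4i = 13 + 13i \\
\langle w, w \rangle &= 25 + 1 + 4 = 30
\end{aligned}
$$

Thus, $c = (13 + 13i)/30 = \tfrac{13}{30} + \tfrac{13}{30}i$. Accordingly, $\operatorname{proj}(v, w) = cw = \left( \tfrac{26}{15} + \tfrac{39}{15}i,\ -\tfrac{13}{15} + \tfrac{13}{15}i \right)$.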
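
The computation in Problem 7.49 can be reproduced with Python's built-in complex numbers. This is a sketch for checking the arithmetic only; the function name `inner` is our own, and it follows the convention used here that the second argument is conjugated.

```python
def inner(u, v):
    """Usual inner product on C^n: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

v = (3 + 4j, 2 - 3j)
w = (5 + 1j, 2j)

c = inner(v, w) / inner(w, w)      # Fourier coefficient of v along w
proj = tuple(c * x for x in w)     # projection cw

print(inner(v, w), inner(w, w))    # prints: (13+13j) (30+0j)
print(c, proj)
```

The second line of output is floating point; it agrees with $c = \tfrac{13}{30} + \tfrac{13}{30}i$ and $cw = (\tfrac{26}{15} + \tfrac{39}{15}i,\ -\tfrac{13}{15} + \tfrac{13}{15}i)$ up to rounding.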


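7.50. Prove Theorem 7.18 (Cauchy-Schwarz): Let $V$ be a complex inner product space. Then

$$|\langle u, v \rangle| \le \|u\|\,\|v\|$$

If $v = 0$, the inequality reduces to $0 \le 0$ and hence is valid. Now suppose $v \neq 0$. Using $z\bar{z} = |z|^2$ (for any
complex number $z$) and $\langle v, u \rangle = \overline{\langle u, v \rangle}$, we expand $\|u - \langle u, v \rangle tv\|^2 \ge 0$, where $t$ is any real value:

$$
\begin{aligned}
0 \le \|u - \langle u, v \rangle tv\|^2 &= \langle u - \langle u, v \rangle tv,\ u - \langle u, v \rangle tv \rangle \\
&= \langle u, u \rangle - \overline{\langle u, v \rangle}\,t\,\langle u, v \rangle - \langle u, v \rangle\,t\,\langle v, u \rangle + \langle u, v \rangle \overline{\langle u, v \rangle}\,t^2 \langle v, v \rangle \\
&= \|u\|^2 - 2t\,|\langle u, v \rangle|^2 + |\langle u, v \rangle|^2\,t^2\,\|v\|^2
\end{aligned}
$$

Set $t = 1/\|v\|^2$ to find $0 \le \|u\|^2 - \dfrac{|\langle u, v \rangle|^2}{\|v\|^2}$, from which $|\langle u, v \rangle|^2 \le \|u\|^2\|v\|^2$. Taking the square
root of both sides, we obtain the required inequality.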


7.51. Find an orthogonal basis for $u^\perp$ in $\mathbf{C}^3$ where $u = (1,\ i,\ 1 + i)$.

Here $u^\perp$ consists of all vectors $w = (x, y, z)$ such that

$$\langle w, u \rangle = x - iy + (1 - i)z = 0$$

Find one solution, say $w_1 = (0,\ 1 - i,\ i)$. Then find a solution of the system

$$x - iy + (1 - i)z = 0, \qquad (1 + i)y - iz = 0$$

Here $z$ is a free variable. Set $z = 1$ to obtain $y = i/(1 + i) = (1 + i)/2$ and $x = (3i - 3)/2$. Multiplying by 2
yields the solution $w_2 = (3i - 3,\ 1 + i,\ 2)$. The vectors $w_1$ and $w_2$ form an orthogonal basis for $u^\perp$.
7.52. Find an orthonormal basis of the subspace $W$ of $\mathbf{C}^3$ spanned by

$$v_1 = (1, i, 0) \quad\text{and}\quad v_2 = (1,\ 2,\ 1 - i)$$

Apply the Gram-Schmidt algorithm. Set $w_1 = v_1 = (1, i, 0)$. Compute

$$v_2 - \frac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle}\,w_1 = (1,\ 2,\ 1 - i) - \frac{1 - 2i}{2}(1, i, 0) = \left( \tfrac{1}{2} + i,\ 1 - \tfrac{1}{2}i,\ 1 - i \right)$$

Multiply by 2 to clear fractions, obtaining $w_2 = (1 + 2i,\ 2 - i,\ 2 - 2i)$. Next find $\|w_1\| = \sqrt{2}$ and then
$\|w_2\| = \sqrt{18}$. Normalizing $\{w_1, w_2\}$, we obtain the following orthonormal basis of $W$:

$$u_1 = \left( \frac{1}{\sqrt{2}},\ \frac{i}{\sqrt{2}},\ 0 \right), \qquad u_2 = \left( \frac{1 + 2i}{\sqrt{18}},\ \frac{2 - i}{\sqrt{18}},\ \frac{2 - 2i}{\sqrt{18}} \right)$$
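
The Gram-Schmidt steps of Problem 7.52 can be traced in code. The following Python sketch is an illustration under stated assumptions: the function names are our own, the input vectors are assumed linearly independent, and results are floating point, so they match the exact values only up to rounding.

```python
import math

def inner(u, v):
    """Usual inner product on C^n: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthogonal list spanning the same subspace as `vectors`."""
    ortho = []
    for v in vectors:
        w = list(v)
        for u in ortho:
            c = inner(v, u) / inner(u, u)              # component of v along u
            w = [wk - c * uk for wk, uk in zip(w, u)]  # subtract the projection
        ortho.append(w)
    return ortho

def normalize(w):
    """Scale w to unit length."""
    length = math.sqrt(inner(w, w).real)
    return [x / length for x in w]

v1 = (1, 1j, 0)
v2 = (1, 2, 1 - 1j)
w1, w2 = gram_schmidt([v1, v2])
u1, u2 = normalize(w1), normalize(w2)

print([2 * x for x in w2])   # twice w2, to compare with (1 + 2i, 2 - i, 2 - 2i)
print(abs(inner(u1, u2)))    # should be 0 up to rounding
```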


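7.53. Find the matrix $P$ that represents the usual inner product on $\mathbf{C}^3$ relative to the basis $\{1,\ i,\ 1 - i\}$.

Compute the following six inner products:

$$
\begin{aligned}
\langle 1, 1 \rangle &= 1, & \langle 1, i \rangle &= \bar{i} = -i, & \langle 1, 1 - i \rangle &= \overline{1 - i} = 1 + i \\
\langle i, i \rangle &= i\bar{i} = 1, & \langle i, 1 - i \rangle &= i\,\overline{(1 - i)} = -1 + i, & \langle 1 - i, 1 - i \rangle &= 2
\end{aligned}
$$

Then, using $\langle u, v \rangle = \overline{\langle v, u \rangle}$, we obtain

$$P = \begin{bmatrix} 1 & -i & 1 + i \\ i & 1 & -1 + i \\ 1 - i & -1 - i & 2 \end{bmatrix}$$

(As expected, $P$ is Hermitian; that is, $P^H = P$.)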


Normed Vector Spaces 

7.54. Consider vectors $u = (1, 3, -6, 4)$ and $v = (3, -5, 1, -2)$ in $\mathbf{R}^4$. Find

(a) $\|u\|_\infty$ and $\|v\|_\infty$, (b) $\|u\|_1$ and $\|v\|_1$, (c) $\|u\|_2$ and $\|v\|_2$,

(d) $d_\infty(u, v)$, $d_1(u, v)$, $d_2(u, v)$.

(a) The infinity norm chooses the maximum of the absolute values of the components. Hence,

$$\|u\|_\infty = 6 \quad\text{and}\quad \|v\|_\infty = 5$$

(b) The one-norm adds the absolute values of the components. Thus,

$$\|u\|_1 = 1 + 3 + 6 + 4 = 14 \quad\text{and}\quad \|v\|_1 = 3 + 5 + 1 + 2 = 11$$

(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm
induced by the usual inner product on $\mathbf{R}^4$). Thus,

$$\|u\|_2 = \sqrt{1 + 9 + 36 + 16} = \sqrt{62} \quad\text{and}\quad \|v\|_2 = \sqrt{9 + 25 + 1 + 4} = \sqrt{39}$$

(d) First find $u - v = (-2, 8, -7, 6)$. Then

$$
\begin{aligned}
d_\infty(u, v) &= \|u - v\|_\infty = 8 \\
d_1(u, v) &= \|u - v\|_1 = 2 + 8 + 7 + 6 = 23 \\
d_2(u, v) &= \|u - v\|_2 = \sqrt{4 + 64 + 49 + 36} = \sqrt{153}
\end{aligned}
$$
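
The three norms and the distances they induce can be computed directly. The Python sketch below is only an illustration; the function names `norm_inf`, `norm_1`, `norm_2`, and `distance` are our own.

```python
import math

def norm_inf(u):
    """Infinity norm: largest absolute value of a component."""
    return max(abs(x) for x in u)

def norm_1(u):
    """One-norm: sum of the absolute values of the components."""
    return sum(abs(x) for x in u)

def norm_2(u):
    """Two-norm: square root of the sum of squared absolute values."""
    return math.sqrt(sum(abs(x) ** 2 for x in u))

def distance(norm, u, v):
    """Distance induced by a norm: d(u, v) = ||u - v||."""
    return norm([a - b for a, b in zip(u, v)])

u = (1, 3, -6, 4)
v = (3, -5, 1, -2)

print(norm_inf(u), norm_1(u), norm_2(u) ** 2)
print(distance(norm_inf, u, v), distance(norm_1, u, v), distance(norm_2, u, v) ** 2)
```

Because `abs` also handles complex numbers, the same functions apply to vectors in $\mathbf{C}^n$.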

7.55. Consider the function $f(t) = t^2 - 4t$ in $C[0, 3]$.

(a) Find $\|f\|_\infty$, (b) Plot $f(t)$ in the plane $\mathbf{R}^2$, (c) Find $\|f\|_1$, (d) Find $\|f\|_2$.

(a) We seek $\|f\|_\infty = \max(|f(t)|)$. Because $f(t)$ is differentiable on $[0, 3]$, $|f(t)|$ has a maximum at a critical
point of $f(t)$ (i.e., when the derivative $f'(t) = 0$), or at an endpoint of $[0, 3]$. Because $f'(t) = 2t - 4$, we set
$2t - 4 = 0$ and obtain $t = 2$ as a critical point. Compute

$$f(2) = 4 - 8 = -4, \qquad f(0) = 0 - 0 = 0, \qquad f(3) = 9 - 12 = -3$$

Thus, $\|f\|_\infty = |f(2)| = |-4| = 4$.

(b) Compute $f(t)$ for various values of $t$ in $[0, 3]$, for example,

t      0    1    2    3
f(t)   0   -3   -4   -3

Plot the points in $\mathbf{R}^2$ and then draw a continuous curve through the points, as shown in Fig. 7-8.

Figure 7-8

(c) We seek $\|f\|_1 = \int_0^3 |f(t)|\,dt$. As indicated in Fig. 7-8, $f(t) \le 0$ on $[0, 3]$; hence,

$$|f(t)| = -(t^2 - 4t) = 4t - t^2$$

Thus,

$$\|f\|_1 = \int_0^3 (4t - t^2)\,dt = \left[ 2t^2 - \frac{t^3}{3} \right]_0^3 = 18 - 9 = 9$$

(d) $$\|f\|_2^2 = \int_0^3 f(t)^2\,dt = \int_0^3 (t^4 - 8t^3 + 16t^2)\,dt = \left[ \frac{t^5}{5} - 2t^4 + \frac{16t^3}{3} \right]_0^3 = \frac{153}{5}$$

Thus, $\|f\|_2 = \sqrt{153/5}$.


7.56. Prove Theorem 7.24: Let $V$ be a normed vector space. Then the function $d(u, v) = \|u - v\|$
satisfies the following three axioms of a metric space:

$[M_1]$ $d(u, v) \ge 0$; and $d(u, v) = 0$ iff $u = v$.

$[M_2]$ $d(u, v) = d(v, u)$.

$[M_3]$ $d(u, v) \le d(u, w) + d(w, v)$.

If $u \neq v$, then $u - v \neq 0$, and hence, $d(u, v) = \|u - v\| > 0$. Also, $d(u, u) = \|u - u\| = \|0\| = 0$. Thus,
$[M_1]$ is satisfied. We also have

$$d(u, v) = \|u - v\| = \|-1(v - u)\| = |-1|\,\|v - u\| = \|v - u\| = d(v, u)$$

and

$$d(u, v) = \|u - v\| = \|(u - w) + (w - v)\| \le \|u - w\| + \|w - v\| = d(u, w) + d(w, v)$$

Thus, $[M_2]$ and $[M_3]$ are satisfied.


SUPPLEMENTARY PROBLEMS 


Inner Products 

7.57. Verify that the following is an inner product on $\mathbf{R}^2$, where $u = (x_1, x_2)$ and $v = (y_1, y_2)$:

$$f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2$$

7.58. Find the values of $k$ so that the following is an inner product on $\mathbf{R}^2$, where $u = (x_1, x_2)$ and $v = (y_1, y_2)$:

$$f(u, v) = x_1y_1 - 3x_1y_2 - 3x_2y_1 + kx_2y_2$$















7.59. Consider the vectors $u = (1, -3)$ and $v = (2, 5)$ in $\mathbf{R}^2$. Find

(a) $\langle u, v \rangle$ with respect to the usual inner product in $\mathbf{R}^2$.

(b) $\langle u, v \rangle$ with respect to the inner product in $\mathbf{R}^2$ in Problem 7.57.

(c) $\|v\|$ using the usual inner product in $\mathbf{R}^2$.

(d) $\|v\|$ using the inner product in $\mathbf{R}^2$ in Problem 7.57.

7.60. Show that each of the following is not an inner product on $\mathbf{R}^3$, where $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$:

(a) $\langle u, v \rangle = x_1y_1 + x_2y_2$, (b) $\langle u, v \rangle = x_1y_2x_3 + y_1x_2y_3$.

7.61. Let $V$ be the vector space of $m \times n$ matrices over $\mathbf{R}$. Show that $\langle A, B \rangle = \operatorname{tr}(B^TA)$ defines an inner product
in $V$.

7.62. Suppose $|\langle u, v \rangle| = \|u\|\,\|v\|$. (That is, the Cauchy-Schwarz inequality reduces to an equality.) Show that $u$ and
$v$ are linearly dependent.

7.63. Suppose $f(u, v)$ and $g(u, v)$ are inner products on a vector space $V$ over $\mathbf{R}$. Prove

(a) The sum $f + g$ is an inner product on $V$, where $(f + g)(u, v) = f(u, v) + g(u, v)$.

(b) The scalar product $kf$, for $k > 0$, is an inner product on $V$, where $(kf)(u, v) = kf(u, v)$.


Orthogonality, Orthogonal Complements, Orthogonal Sets 

7.64. Let $V$ be the vector space of polynomials over $\mathbf{R}$ of degree $\le 2$ with inner product defined by
$\langle f, g \rangle = \int_0^1 f(t)g(t)\,dt$. Find a basis of the subspace $W$ orthogonal to $h(t) = 2t + 1$.

7.65. Find a basis of the subspace $W$ of $\mathbf{R}^4$ orthogonal to $u_1 = (1, -2, 3, 4)$ and $u_2 = (3, -5, 7, 8)$.

7.66. Find a basis for the subspace $W$ of $\mathbf{R}^5$ orthogonal to the vectors $u_1 = (1, 1, 3, 4, 1)$ and $u_2 = (1, 2, 1, 2, 1)$.

7.67. Let $w = (1, -2, -1, 3)$ be a vector in $\mathbf{R}^4$. Find

(a) an orthogonal basis for $w^\perp$, (b) an orthonormal basis for $w^\perp$.

7.68. Let $W$ be the subspace of $\mathbf{R}^4$ orthogonal to $u_1 = (1, 1, 2, 2)$ and $u_2 = (0, 1, 2, -1)$. Find

(a) an orthogonal basis for $W$, (b) an orthonormal basis for $W$. (Compare with Problem 7.65.)

7.69. Let $S$ consist of the following vectors in $\mathbf{R}^4$:

$$u_1 = (1, 1, 1, 1), \quad u_2 = (1, 1, -1, -1), \quad u_3 = (1, -1, 1, -1), \quad u_4 = (1, -1, -1, 1)$$

(a) Show that $S$ is orthogonal and a basis of $\mathbf{R}^4$.

(b) Write $v = (1, 3, -5, 6)$ as a linear combination of $u_1, u_2, u_3, u_4$.

(c) Find the coordinates of an arbitrary vector $v = (a, b, c, d)$ in $\mathbf{R}^4$ relative to the basis $S$.

(d) Normalize $S$ to obtain an orthonormal basis of $\mathbf{R}^4$.


7.70. Let $M = M_{2,2}$ with inner product $\langle A, B \rangle = \operatorname{tr}(B^TA)$. Show that the following is an orthonormal basis for $M$:

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

7.71. Let $M = M_{2,2}$ with inner product $\langle A, B \rangle = \operatorname{tr}(B^TA)$. Find an orthogonal basis for the orthogonal complement
of (a) diagonal matrices, (b) symmetric matrices.

7.72. Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Show that $\{k_1u_1, k_2u_2, \ldots, k_ru_r\}$ is an orthogonal set
for any scalars $k_1, k_2, \ldots, k_r$.

7.73. Let $U$ and $W$ be subspaces of a finite-dimensional inner product space $V$. Show that

(a) $(U + W)^\perp = U^\perp \cap W^\perp$, (b) $(U \cap W)^\perp = U^\perp + W^\perp$.

Projections, Gram-Schmidt Algorithm, Applications 

7.74. Find the Fourier coefficient $c$ and projection $cw$ of $v$ along $w$, where

(a) $v = (2, 3, -5)$ and $w = (1, -5, 2)$ in $\mathbf{R}^3$.

(b) $v = (1, 3, 1, 2)$ and $w = (1, -2, 7, 4)$ in $\mathbf{R}^4$.

(c) $v = t^2$ and $w = t + 3$ in $P(t)$, with inner product $\langle f, g \rangle = \int_0^1 f(t)g(t)\,dt$.

and w = ^ ^ in M = M 2j2 , with inner product (A, B) = tr {B T A). 

7.75. Let $U$ be the subspace of $\mathbf{R}^4$ spanned by

$$v_1 = (1, 1, 1, 1), \quad v_2 = (1, -1, 2, 2), \quad v_3 = (1, 2, -3, -4)$$

(a) Apply the Gram-Schmidt algorithm to find an orthogonal and an orthonormal basis for $U$.

(b) Find the projection of $v = (1, 2, -3, 4)$ onto $U$.

7.76. Suppose $v = (1, 2, 3, 4, 6)$. Find the projection of $v$ onto $W$, or, in other words, find $w \in W$ that minimizes
$\|v - w\|$, where $W$ is the subspace of $\mathbf{R}^5$ spanned by

(a) $u_1 = (1, 2, 1, 2, 1)$ and $u_2 = (1, -1, 2, -1, 1)$, (b) $v_1 = (1, 2, 1, 2, 1)$ and $v_2 = (1, 0, 1, 5, -1)$.

7.77. Consider the subspace $W = P_2(t)$ of $P(t)$ with inner product $\langle f, g \rangle = \int_0^1 f(t)g(t)\,dt$. Find the projection of
$f(t) = t^3$ onto $W$. (Hint: Use the orthogonal polynomials $1$, $2t - 1$, $6t^2 - 6t + 1$ obtained in Problem 7.22.)

7.78. Consider $P(t)$ with inner product $\langle f, g \rangle = \int_{-1}^{1} f(t)g(t)\,dt$ and the subspace $W = P_3(t)$.

(a) Find an orthogonal basis for $W$ by applying the Gram-Schmidt algorithm to $\{1, t, t^2, t^3\}$.

(b) Find the projection of $f(t) = t^5$ onto $W$.

Orthogonal Matrices 

7.79. Find the number and exhibit all $2 \times 2$ orthogonal matrices of the form $\begin{bmatrix} \tfrac{1}{3} & x \\ y & z \end{bmatrix}$.

7.80. Find a $3 \times 3$ orthogonal matrix $P$ whose first two rows are multiples of $u = (1, 1, 1)$ and $v = (1, -3, 2)$,
respectively.

7.81. Find a symmetric orthogonal matrix $P$ whose first row is $(\tfrac{1}{3}, \tfrac{2}{3}, \tfrac{2}{3})$. (Compare with Problem 7.32.)

7.82. Real matrices $A$ and $B$ are said to be orthogonally equivalent if there exists an orthogonal matrix $P$ such that
$B = P^TAP$. Show that this relation is an equivalence relation.

Positive Definite Matrices and Inner Products 

7.83. Find the matrix A that represents the usual inner product on R 2 relative to each of the following bases: 

(a) $\{v_1 = (1, 4),\ v_2 = (2, -3)\}$, (b) $\{w_1 = (1, -3),\ w_2 = (6, 2)\}$.

7.84. Consider the following inner product on $\mathbf{R}^2$:

$$f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2, \quad\text{where}\quad u = (x_1, x_2),\ v = (y_1, y_2)$$

Find the matrix $B$ that represents this inner product on $\mathbf{R}^2$ relative to each basis in Problem 7.83.

7.85. Find the matrix $C$ that represents the usual inner product on $\mathbf{R}^3$ relative to the basis $S$ of $\mathbf{R}^3$ consisting of the vectors

$$u_1 = (1, 1, 1), \quad u_2 = (1, 2, 1), \quad u_3 = (1, -1, 3)$$

7.86. Let $V = P_2(t)$ with inner product $\langle f, g \rangle = \int_0^1 f(t)g(t)\,dt$.

(a) Find $\langle f, g \rangle$, where $f(t) = t + 2$ and $g(t) = t^2 - 3t + 4$.

(b) Find the matrix $A$ of the inner product with respect to the basis $\{1, t, t^2\}$ of $V$.

(c) Verify Theorem 7.16 that $\langle f, g \rangle = [f]^TA[g]$ with respect to the basis $\{1, t, t^2\}$.

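7.87. Determine which of the following matrices are positive definite:

$$\text{(a) } \begin{bmatrix} 1 & 3 \\ 3 & 5 \end{bmatrix}, \quad \text{(b) } \begin{bmatrix} 3 & 4 \\ 4 & 7 \end{bmatrix}, \quad \text{(c) } \begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}, \quad \text{(d) } \begin{bmatrix} 6 & -7 \\ -7 & 9 \end{bmatrix}$$

7.88. Suppose $A$ and $B$ are positive definite matrices. Show that:

(a) $A + B$ is positive definite and (b) $kA$ is positive definite for $k > 0$.

7.89. Suppose $B$ is a real nonsingular matrix. Show that: (a) $B^TB$ is symmetric and (b) $B^TB$ is positive definite.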


Complex Inner Product Spaces 

7.90. Verify that

$$\langle a_1u_1 + a_2u_2,\ b_1v_1 + b_2v_2 \rangle = a_1\bar{b}_1\langle u_1, v_1 \rangle + a_1\bar{b}_2\langle u_1, v_2 \rangle + a_2\bar{b}_1\langle u_2, v_1 \rangle + a_2\bar{b}_2\langle u_2, v_2 \rangle$$

More generally, prove that $\left\langle \sum_{i=1}^{m} a_iu_i,\ \sum_{j=1}^{n} b_jv_j \right\rangle = \sum_{i,j} a_i\bar{b}_j\langle u_i, v_j \rangle$.

7.91. Consider $u = (1 + i,\ 3,\ 4 - i)$ and $v = (3 - 4i,\ 1 + i,\ 2i)$ in $\mathbf{C}^3$. Find
(a) $\langle u, v \rangle$, (b) $\langle v, u \rangle$, (c) $\|u\|$, (d) $\|v\|$, (e) $d(u, v)$.


7.92. Find the Fourier coefficient $c$ and the projection $cw$ of

(a) $u = (3 + i,\ 5 - 2i)$ along $w = (5 + i,\ 1 + i)$ in $\mathbf{C}^2$,

(b) $u = (1 - i,\ 3i,\ 1 + i)$ along $w = (1,\ 2 - i,\ 3 + 2i)$ in $\mathbf{C}^3$.

7.93. Let $u = (z_1, z_2)$ and $v = (w_1, w_2)$ belong to $\mathbf{C}^2$. Verify that the following is an inner product of $\mathbf{C}^2$:

$$f(u, v) = z_1\bar{w}_1 + (1 + i)z_1\bar{w}_2 + (1 - i)z_2\bar{w}_1 + 3z_2\bar{w}_2$$


7.94. Find an orthogonal basis and an orthonormal basis for the subspace $W$ of $\mathbf{C}^3$ spanned by $u_1 = (1, i, 1)$ and
$u_2 = (1 + i,\ 0,\ 2)$.

7.95. Let $u = (z_1, z_2)$ and $v = (w_1, w_2)$ belong to $\mathbf{C}^2$. For what values of $a, b, c, d \in \mathbf{C}$ is the following an inner
product on $\mathbf{C}^2$?

$$f(u, v) = az_1\bar{w}_1 + bz_1\bar{w}_2 + cz_2\bar{w}_1 + dz_2\bar{w}_2$$

7.96. Prove the following form for an inner product in a complex space $V$:

$$\langle u, v \rangle = \tfrac{1}{4}\|u + v\|^2 - \tfrac{1}{4}\|u - v\|^2 + \tfrac{i}{4}\|u + iv\|^2 - \tfrac{i}{4}\|u - iv\|^2$$

[Compare with Problem 7.7(b).]

7.97. Let $V$ be a real inner product space. Show that

(i) $\|u\| = \|v\|$ if and only if $\langle u + v, u - v \rangle = 0$;

(ii) $\|u + v\|^2 = \|u\|^2 + \|v\|^2$ if and only if $\langle u, v \rangle = 0$.

Show by counterexamples that the above statements are not true for, say, $\mathbf{C}^2$.


7.98. Find the matrix $P$ that represents the usual inner product on $\mathbf{C}^3$ relative to the basis $\{1,\ 1 + i,\ 1 - 2i\}$.

7.99. A complex matrix $A$ is unitary if it is invertible and $A^{-1} = A^H$. Alternatively, $A$ is unitary if its rows (columns)
form an orthonormal set of vectors (relative to the usual inner product of $\mathbf{C}^n$). Find a unitary matrix whose
first row is: (a) a multiple of $(1,\ 1 - i)$; (b) a multiple of $(\tfrac{1}{2},\ \tfrac{1}{2}i,\ \tfrac{1}{2} - \tfrac{1}{2}i)$.

Normed Vector Spaces 

7.100. Consider vectors $u = (1, -3, 4, 1, -2)$ and $v = (3, 1, -2, -3, 1)$ in $\mathbf{R}^5$. Find

(a) $\|u\|_\infty$ and $\|v\|_\infty$, (b) $\|u\|_1$ and $\|v\|_1$, (c) $\|u\|_2$ and $\|v\|_2$, (d) $d_\infty(u, v)$, $d_1(u, v)$, $d_2(u, v)$

7.101. Repeat Problem 7.100 for $u = (1 + i,\ 2 - 4i)$ and $v = (1 - i,\ 2 + 3i)$ in $\mathbf{C}^2$.

7.102. Consider the functions $f(t) = 5t - t^2$ and $g(t) = 3t - t^2$ in $C[0, 4]$. Find

(a) $d_\infty(f, g)$, (b) $d_1(f, g)$, (c) $d_2(f, g)$

7.103. Prove (a) $\|\cdot\|_1$ is a norm on $\mathbf{R}^n$. (b) $\|\cdot\|_\infty$ is a norm on $\mathbf{R}^n$.

7.104. Prove (a) $\|\cdot\|_1$ is a norm on $C[a, b]$. (b) $\|\cdot\|_\infty$ is a norm on $C[a, b]$.


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$. Also, bases need not be unique.

7.58. $k > 9$

7.59. (a) $-13$, (b) $-71$, (c) $\sqrt{29}$, (d) $\sqrt{89}$

7.60. Let $u = (0, 0, 1)$; then $\langle u, u \rangle = 0$ in both cases

7.64. $\{7t^2 - 5t,\ 12t^2 - 5\}$

7.65. $\{(1, 2, 1, 0),\ (4, 4, 0, 1)\}$

7.66. $(-1, 0, 0, 0, 1)$, $(-6, 2, 0, 1, 0)$, $(-5, 2, 1, 0, 0)$

7.67. (a) $u_1 = (0, 0, 3, 1)$, $u_2 = (0, 5, -1, 3)$, $u_3 = (-14, -2, -1, 3)$,

(b) $u_1/\sqrt{10}$, $u_2/\sqrt{35}$, $u_3/\sqrt{210}$

7.68. (a) $(0, 2, -1, 0)$, $(-15, 1, 2, 5)$, (b) $(0, 2, -1, 0)/\sqrt{5}$, $(-15, 1, 2, 5)/\sqrt{255}$

7.69. (b) $v = \tfrac{1}{4}(5u_1 + 3u_2 - 13u_3 + 9u_4)$,

(c) $[v] = \tfrac{1}{4}[a + b + c + d,\ a + b - c - d,\ a - b + c - d,\ a - b - c + d]$

7.71. (a) $[0, 1;\ 0, 0]$, $[0, 0;\ 1, 0]$, (b) $[0, -1;\ 1, 0]$

7.74. (a) $c = -\tfrac{23}{30}$, (b) $c = \tfrac{1}{7}$, (c) $c = \tfrac{15}{148}$, (d) $c = \tfrac{19}{26}$

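7.75. (a) $w_1 = (1, 1, 1, 1)$, $w_2 = (0, -2, 1, 1)$, $w_3 = (12, -4, -1, -7)$,

(b) $\operatorname{proj}(v, U) = \tfrac{1}{5}(-1, 12, 3, 6)$

7.76. (a) $\operatorname{proj}(v, W) = \tfrac{1}{8}(23, 25, 30, 25, 23)$, (b) First find an orthogonal basis for $W$;
say, $w_1 = (1, 2, 1, 2, 1)$ and $w_2 = (0, 2, 0, -3, 2)$. Then $\operatorname{proj}(v, W) = \tfrac{1}{17}(34, 76, 34, 56, 42)$


7.77. $\operatorname{proj}(f, W) = \tfrac{3}{2}t^2 - \tfrac{3}{5}t + \tfrac{1}{20}$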







7.78. (a) $\{1,\ t,\ 3t^2 - 1,\ 5t^3 - 3t\}$, (b) $\operatorname{proj}(f, W) = \tfrac{10}{9}t^3 - \tfrac{5}{21}t$

7.79. Four: $[a, b;\ b, -a]$, $[a, b;\ -b, a]$, $[a, -b;\ b, a]$, $[a, -b;\ -b, -a]$, where $a = \tfrac{1}{3}$ and $b = \tfrac{1}{3}\sqrt{8}$

7.80. $P = [1/a, 1/a, 1/a;\ 1/b, -3/b, 2/b;\ 5/c, -1/c, -4/c]$, where $a = \sqrt{3}$, $b = \sqrt{14}$, $c = \sqrt{42}$

7.81. $\tfrac{1}{3}[1, 2, 2;\ 2, -2, 1;\ 2, 1, -2]$

7.83. (a) $[17, -10;\ -10, 13]$, (b) $[10, 0;\ 0, 40]$

7.84. (a) $[65, -68;\ -68, 73]$, (b) $[58, 8;\ 8, 8]$

7.85. $[3, 4, 3;\ 4, 6, 2;\ 3, 2, 11]$

7.86. (a) $\tfrac{83}{12}$, (b) $[1, a, b;\ a, b, c;\ b, c, d]$, where $a = \tfrac{1}{2}$, $b = \tfrac{1}{3}$, $c = \tfrac{1}{4}$, $d = \tfrac{1}{5}$

7.87. (a) No, (b) Yes, (c) No, (d) Yes

7.91. (a) $-4i$, (b) $4i$, (c) $\sqrt{28}$, (d) $\sqrt{31}$, (e) $\sqrt{59}$

7.92. (a) $c = \tfrac{1}{28}(19 - 5i)$, (b) $c = \tfrac{1}{19}(3 + 6i)$

7.94. $\{v_1 = (1, i, 1)/\sqrt{3},\ v_2 = (2i,\ 1 - 3i,\ 3 - i)/\sqrt{24}\}$

7.95. $a$ and $d$ real and positive, $c = \bar{b}$, and $ad - bc$ positive.

7.97. $u = (1, 2)$, $v = (i, 2i)$

7.98. $P = [1,\ 1 - i,\ 1 + 2i;\ 1 + i,\ 2,\ -1 + 3i;\ 1 - 2i,\ -1 - 3i,\ 5]$

7.99. (a) $(1/\sqrt{3})[1,\ 1 - i;\ 1 + i,\ -1]$,

(b) $[a,\ ai,\ a - ai;\ bi,\ b,\ 0;\ a,\ ai,\ -a + ai]$, where $a = \tfrac{1}{2}$ and $b = 1/\sqrt{2}$.

7.100. (a) 4 and 3, (b) 11 and 10, (c) $\sqrt{31}$ and $\sqrt{24}$, (d) 6, 19, 9

7.101. (a) $\sqrt{20}$ and $\sqrt{13}$, (b) $\sqrt{2} + \sqrt{20}$ and $\sqrt{2} + \sqrt{13}$, (c) $\sqrt{22}$ and $\sqrt{15}$, (d) 7, 9, $\sqrt{53}$




Determinants 


8.1 Introduction 


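Each $n$-square matrix $A = [a_{ij}]$ is assigned a special scalar called the determinant of $A$, denoted by $\det(A)$
or $|A|$ or

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$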

We emphasize that an n x n array of scalars enclosed by straight lines, called a determinant of order n, is 
not a matrix but denotes the determinant of the enclosed array of scalars (i.e., the enclosed matrix). 

The determinant function was first discovered during the investigation of systems of linear equations. 
We shall see that the determinant is an indispensable tool in investigating and obtaining properties of 
square matrices. 

The definition of the determinant and most of its properties also apply in the case where the entries of a 
matrix come from a commutative ring. 

We begin with a special case of determinants of orders 1, 2, and 3. Then we define a determinant of 
arbitrary order. This general definition is preceded by a discussion of permutations, which is necessary for 
our general definition of the determinant. 


8.2 Determinants of Orders 1 and 2 


Determinants of orders 1 and 2 are defined as follows:

$$|a_{11}| = a_{11} \qquad\text{and}\qquad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$$

Thus, the determinant of a $1 \times 1$ matrix $A = [a_{11}]$ is the scalar $a_{11}$; that is, $\det(A) = |a_{11}| = a_{11}$. The
determinant of order two may easily be remembered by using a diagram in which a plus-labeled arrow runs
along the diagonal from $a_{11}$ to $a_{22}$ and a minus-labeled arrow runs along the diagonal from $a_{12}$ to $a_{21}$.

That is, the determinant is equal to the product of the elements along the plus-labeled arrow minus the
product of the elements along the minus-labeled arrow. (There is an analogous diagram for determinants of
order 3, but not for higher-order determinants.)

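EXAMPLE 8.1

(a) Because the determinant of order 1 is the scalar itself, we have:

$$\det(27) = 27, \qquad \det(-7) = -7, \qquad \det(t - 3) = t - 3$$

(b) $$\begin{vmatrix} 5 & 3 \\ 4 & 6 \end{vmatrix} = 5(6) - 3(4) = 30 - 12 = 18, \qquad \begin{vmatrix} 3 & 2 \\ -5 & 7 \end{vmatrix} = 21 + 10 = 31$$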
Application to Linear Equations

Consider two linear equations in two unknowns, say

$$
\begin{aligned}
a_1x + b_1y &= c_1 \\
a_2x + b_2y &= c_2
\end{aligned}
$$

Let $D = a_1b_2 - a_2b_1$, the determinant of the matrix of coefficients. Then the system has a unique solution
if and only if $D \neq 0$. In such a case, the unique solution may be expressed completely in terms of
determinants as follows:

$$x = \frac{N_x}{D} = \frac{b_2c_1 - b_1c_2}{a_1b_2 - a_2b_1} = \frac{\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}, \qquad y = \frac{N_y}{D} = \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1} = \frac{\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}$$

Here $D$ appears in the denominator of both quotients. The numerators $N_x$ and $N_y$ of the quotients for $x$ and
$y$, respectively, can be obtained by substituting the column of constant terms in place of the column of
coefficients of the given unknown in the matrix of coefficients. On the other hand, if $D = 0$, then the
system may have no solution or more than one solution.


EXAMPLE 8.2 


Solve by determinants the system 

\[
\begin{cases} 4x - 3y = 15 \\ 2x + 5y = 1 \end{cases}
\]

First find the determinant D of the matrix of coefficients: 

\[
D = \begin{vmatrix} 4 & -3 \\ 2 & 5 \end{vmatrix} = 4(5) - (-3)(2) = 20 + 6 = 26
\]

Because $D \neq 0$, the system has a unique solution. To obtain the numerators $N_x$ and $N_y$, simply replace, in the matrix 
of coefficients, the coefficients of x and y, respectively, by the constant terms, and then take their determinants: 

\[
N_x = \begin{vmatrix} 15 & -3 \\ 1 & 5 \end{vmatrix} = 75 + 3 = 78,
\qquad
N_y = \begin{vmatrix} 4 & 15 \\ 2 & 1 \end{vmatrix} = 4 - 30 = -26
\]

Then the unique solution of the system is 

\[
x = \frac{N_x}{D} = \frac{78}{26} = 3, \qquad y = \frac{N_y}{D} = \frac{-26}{26} = -1
\]
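The determinant solution of a 2 x 2 system can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `det2` and `cramer_2x2` are introduced here and are not notation from the text, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2x2 array [[a, b], [c, d]]."""
    return a * d - b * c

def cramer_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by determinants.

    Returns (x, y) as exact fractions, or None when D = 0
    (no solution or more than one solution).
    """
    D = det2(a1, b1, a2, b2)
    if D == 0:
        return None
    Nx = det2(c1, b1, c2, b2)  # constants replace the x-column
    Ny = det2(a1, c1, a2, c2)  # constants replace the y-column
    return Fraction(Nx, D), Fraction(Ny, D)
```

Calling `cramer_2x2(4, -3, 15, 2, 5, 1)` reproduces the solution x = 3, y = -1 of Example 8.2.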


8.3 Determinants of Order 3 


Consider an arbitrary 3 x 3 matrix $A = [a_{ij}]$. The determinant of A is defined as follows: 

\[
\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}
\]


Observe that there are six products, each product consisting of three elements of the original matrix. 
Three of the products are plus-labeled (keep their sign) and three of the products are minus-labeled (change 
their sign). 

The diagrams in Fig. 8-1 may help us to remember the above six products in det(A). That is, the 
determinant is equal to the sum of the products of the elements along the three plus-labeled arrows in 
Fig. 8-1 plus the sum of the negatives of the products of the elements along the three minus-labeled arrows. 
We emphasize that there are no such diagrammatic devices with which to remember determinants of 
higher order. 




EXAMPLE 8.3 Let 
\[
A = \begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 3 & 2 & 1 \\ -4 & 5 & -1 \\ 2 & -3 & 4 \end{bmatrix}.
\]
Find det(A) and det(B). 

Use the diagrams in Fig. 8-1: 

det(A) = 2(5)(4) + 1(-2)(1) + 1(-3)(0) - 1(5)(1) - (-3)(-2)(2) - 4(1)(0) 
       = 40 - 2 + 0 - 5 - 12 - 0 = 21 
det(B) = 60 - 4 + 12 - 10 - 9 + 32 = 81 
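The six-product rule for order 3 translates directly into code. The sketch below is illustrative only; the function name `det3` is an assumption introduced here, and the matrix is assumed to be given as three rows of three numbers.

```python
def det3(m):
    """Determinant of a 3x3 matrix by the six-product rule."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    # Three plus-labeled products (they keep their sign).
    plus = a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32
    # Three minus-labeled products (their sign is changed).
    minus = a13 * a22 * a31 + a12 * a21 * a33 + a11 * a23 * a32
    return plus - minus
```

Applied to the matrices A and B of Example 8.3, `det3` returns 21 and 81, respectively.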


Alternative Form for a Determinant of Order 3 

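The determinant of the 3 x 3 matrix $A = [a_{ij}]$ may be rewritten as follows: 

\[
\begin{aligned}
\det(A) &= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \\
&= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
 - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
 + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\end{aligned}
\]

which is a linear combination of three determinants of order 2 whose coefficients (with alternating signs) 
form the first row of the given matrix. This linear combination may be indicated in the form 

\[
a_{11}\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
- a_{12}\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
+ a_{13}\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
\]

where, in the first, second, and third array, the first row is struck out together with the first, second, and third column, respectively. 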


Note that each 2x2 matrix can be obtained by deleting, in the original matrix, the row and column 
containing its coefficient. 

EXAMPLE 8.4 


\[
\begin{aligned}
\begin{vmatrix} 1 & 2 & 3 \\ 4 & -2 & 3 \\ 0 & 5 & -1 \end{vmatrix}
&= 1\begin{vmatrix} -2 & 3 \\ 5 & -1 \end{vmatrix}
 - 2\begin{vmatrix} 4 & 3 \\ 0 & -1 \end{vmatrix}
 + 3\begin{vmatrix} 4 & -2 \\ 0 & 5 \end{vmatrix} \\
&= 1(2 - 15) - 2(-4 + 0) + 3(20 + 0) = -13 + 8 + 60 = 55
\end{aligned}
\]

8.4 Permutations 

A permutation $\sigma$ of the set {1, 2, ..., n} is a one-to-one mapping of the set onto itself or, equivalently, a 
rearrangement of the numbers 1, 2, ..., n. Such a permutation $\sigma$ is denoted by 

\[
\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ j_1 & j_2 & \cdots & j_n \end{pmatrix}
\quad\text{or}\quad
\sigma = j_1 j_2 \cdots j_n, \quad\text{where } j_i = \sigma(i)
\]

The set of all such permutations is denoted by $S_n$, and the number of such permutations is n!. If $\sigma \in S_n$, 
then the inverse mapping $\sigma^{-1} \in S_n$; and if $\sigma, \tau \in S_n$, then the composition mapping $\sigma \circ \tau \in S_n$. Also, the 
identity mapping $\varepsilon = \sigma \circ \sigma^{-1} \in S_n$. (In fact, $\varepsilon = 123 \ldots n$.) 

EXAMPLE 8.5 

(a) There are 2! = 2 · 1 = 2 permutations in $S_2$; they are 12 and 21. 

(b) There are 3! = 3 · 2 · 1 = 6 permutations in $S_3$; they are 123, 132, 213, 231, 312, 321. 


Sign (Parity) of a Permutation 

Consider an arbitrary permutation $\sigma$ in $S_n$, say $\sigma = j_1 j_2 \cdots j_n$. We say $\sigma$ is an even or odd permutation 
according to whether there is an even or odd number of inversions in $\sigma$. By an inversion in $\sigma$ we mean a 
pair of integers (i, k) such that i > k, but i precedes k in $\sigma$. We then define the sign or parity of $\sigma$, written 
sgn $\sigma$, by 

\[
\operatorname{sgn} \sigma = \begin{cases} 1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd} \end{cases}
\]


EXAMPLE 8.6 

(a) Find the sign of $\sigma = 35142$ in $S_5$. 

For each element k, we count the number of elements i such that i > k and i precedes k in $\sigma$. There are 

2 numbers (3 and 5) greater than and preceding 1, 

3 numbers (3, 5, and 4) greater than and preceding 2, 

1 number (5) greater than and preceding 4. 

(There are no numbers greater than and preceding either 3 or 5.) Because there are, in all, six inversions, $\sigma$ is 
even and sgn $\sigma$ = 1. 

(b) The identity permutation $\varepsilon = 123 \ldots n$ is even because there are no inversions in $\varepsilon$. 

(c) In $S_2$, the permutation 12 is even and 21 is odd. In $S_3$, the permutations 123, 231, 312 are even and the 
permutations 132, 213, 321 are odd. 

(d) Let $\tau$ be the permutation that interchanges two numbers i and j and leaves the other numbers fixed. That is, 

\[
\tau(i) = j, \quad \tau(j) = i, \quad \tau(k) = k, \quad\text{where } k \neq i, j
\]

We call $\tau$ a transposition. If i < j, then there are 2(j - i) - 1 inversions in $\tau$, and hence, the transposition $\tau$ 
is odd. 

Remark: One can show that, for any n, half of the permutations in $S_n$ are even and half of them are 
odd. For example, 3 of the 6 permutations in $S_3$ are even, and 3 are odd. 
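Counting inversions is mechanical, so the sign of a permutation can be sketched as follows. The function names `count_inversions` and `sgn` are assumptions introduced for this illustration, and a permutation is assumed to be given as the sequence of its images $j_1 j_2 \cdots j_n$.

```python
def count_inversions(perm):
    """Number of pairs in which a larger number precedes a smaller one."""
    n = len(perm)
    return sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if perm[a] > perm[b]
    )

def sgn(perm):
    """Sign of a permutation: +1 if the inversion count is even, -1 if odd."""
    return 1 if count_inversions(perm) % 2 == 0 else -1
```

For the permutation 35142 of Example 8.6(a), `count_inversions([3, 5, 1, 4, 2])` gives six inversions, so `sgn` returns 1.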


8.5 Determinants of Arbitrary Order 

Let $A = [a_{ij}]$ be a square matrix of order n over a field K. 

Consider a product of n elements of A such that one and only one element comes from each row and one 
and only one element comes from each column. Such a product can be written in the form 

\[
a_{1j_1} a_{2j_2} \cdots a_{nj_n}
\]

that is, where the factors come from successive rows, and so the first subscripts are in the natural order 
1, 2, ..., n. Now because the factors come from different columns, the sequence of second subscripts 
forms a permutation $\sigma = j_1 j_2 \cdots j_n$ in $S_n$. Conversely, each permutation in $S_n$ determines a product of the 
above form. Thus, the matrix A contains n! such products. 

DEFINITION 8.1: The determinant of $A = [a_{ij}]$, denoted by det(A) or |A|, is the sum of all the above n! 
products, where each such product is multiplied by sgn $\sigma$. That is, 

\[
|A| = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1j_1} a_{2j_2} \cdots a_{nj_n}
\quad\text{or}\quad
|A| = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}
\]

The determinant of the n-square matrix A is said to be of order n. 
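Definition 8.1 can be evaluated literally for small matrices. The following sketch is illustrative only: the names `sgn` and `det_by_definition` are assumptions introduced here, indices are 0-based (which does not change the inversion count), and the n! terms make this approach impractical beyond small n.

```python
from itertools import permutations

def sgn(perm):
    """Sign of a permutation, computed from its number of inversions."""
    n = len(perm)
    inversions = sum(
        1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b]
    )
    return 1 if inversions % 2 == 0 else -1

def det_by_definition(A):
    """Sum over all permutations of sgn(sigma) * a[0][sigma(0)] * ... * a[n-1][sigma(n-1)]."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = sgn(sigma)
        for row in range(n):
            term *= A[row][sigma[row]]  # one factor from each row and each column
        total += term
    return total
```

For n = 2 and n = 3 the sum has 2 and 6 terms, matching the formulas of Sections 8.2 and 8.3.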

The next example shows that the above definition agrees with the previous definition of determinants 
of orders 1, 2, and 3. 


EXAMPLE 8.7 


(a) Let $A = [a_{11}]$ be a 1 x 1 matrix. Because $S_1$ has only one permutation, which is even, det(A) = $a_{11}$, the number 
itself. 

(b) Let $A = [a_{ij}]$ be a 2 x 2 matrix. In $S_2$, the permutation 12 is even and the permutation 21 is odd. Hence, 

\[
\det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}
\]

(c) Let $A = [a_{ij}]$ be a 3 x 3 matrix. In $S_3$, the permutations 123, 231, 312 are even, and the permutations 321, 213, 
132 are odd. Hence, 

\[
\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}
\]


Remark: As n increases, the number of terms in the determinant becomes astronomical. 
Accordingly, we use indirect methods to evaluate determinants rather than the definition of the 
determinant. In fact, we prove a number of properties about determinants that will permit us to shorten 
the computation considerably. In particular, we show that a determinant of order n is equal to a linear 
combination of determinants of order n - 1, as in the case n = 3 above. 


8.6 Properties of Determinants 


We now list basic properties of the determinant. 

THEOREM 8.1: The determinant of a matrix A and its transpose $A^T$ are equal; that is, $|A| = |A^T|$. 

By this theorem (proved in Problem 8.22), any theorem about the determinant of a matrix A that 
concerns the rows of A will have an analogous theorem concerning the columns of A. 

The next theorem (proved in Problem 8.24) gives certain cases for which the determinant can be 
obtained immediately. 

THEOREM 8.2: Let A be a square matrix. 

(i) If A has a row (column) of zeros, then |A| = 0. 

(ii) If A has two identical rows (columns), then |A| = 0. 

(iii) If A is triangular (i.e., A has zeros above or below the diagonal), then 
|A| = product of diagonal elements. Thus, in particular, |I| = 1, where I is 
the identity matrix. 

The next theorem (proved in Problems 8.23 and 8.25) shows how the determinant of a matrix is affected 
by the elementary row and column operations. 

THEOREM 8.3: Suppose B is obtained from A by an elementary row (column) operation. 

(i) If two rows (columns) of A were interchanged, then |B| = -|A|. 

(ii) If a row (column) of A were multiplied by a scalar k, then |B| = k|A|. 

(iii) If a multiple of a row (column) of A were added to another row (column) of A, 
then |B| = |A|. 

Major Properties of Determinants 

We now state two of the most important and useful theorems on determinants. 

THEOREM 8.4: The determinant of a product of two matrices A and B is the product of their 
determinants; that is, 

det(AB) = det(A) det(B) 

The above theorem says that the determinant is a multiplicative function. 

THEOREM 8.5: Let A be a square matrix. Then the following are equivalent: 

(i) A is invertible; that is, A has an inverse $A^{-1}$. 

(ii) AX = 0 has only the zero solution. 

(iii) The determinant of A is not zero; that is, $\det(A) \neq 0$. 

Remark: Depending on the author and the text, a nonsingular matrix A is defined to be an invertible 
matrix A, or a matrix A for which $|A| \neq 0$, or a matrix A for which AX = 0 has only the zero solution. The 
above theorem shows that all such definitions are equivalent. 

We will prove Theorems 8.4 and 8.5 (in Problems 8.29 and 8.28, respectively) using the theory of 
elementary matrices and the following lemma (proved in Problem 8.26), which is a special case of 
Theorem 8.4. 

LEMMA 8.6: Let E be an elementary matrix. Then, for any matrix A, |EA| = |E||A|. 

Recall that matrices A and B are similar if there exists a nonsingular matrix P such that $B = P^{-1}AP$. 
Using the multiplicative property of the determinant (Theorem 8.4), one can easily prove (Problem 8.31) 
the following theorem. 

THEOREM 8.7: Suppose A and B are similar matrices. Then |A| = |B|. 


8.7 Minors and Cofactors 

Consider an n-square matrix $A = [a_{ij}]$. Let $M_{ij}$ denote the (n - 1)-square submatrix of A obtained by 
deleting its ith row and jth column. The determinant $|M_{ij}|$ is called the minor of the element $a_{ij}$ of A, and we 
define the cofactor of $a_{ij}$, denoted by $A_{ij}$, to be the “signed” minor: 

\[
A_{ij} = (-1)^{i+j} |M_{ij}|
\]

Note that the “signs” $(-1)^{i+j}$ accompanying the minors form a chessboard pattern with +’s on the main 
diagonal: 

\[
\begin{bmatrix}
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
+ & - & + & - & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{bmatrix}
\]

We emphasize that $M_{ij}$ denotes a matrix, whereas $A_{ij}$ denotes a scalar. 

Remark: The sign $(-1)^{i+j}$ of the cofactor $A_{ij}$ is frequently obtained using the checkerboard pattern. 
Specifically, beginning with + and alternating signs: 

+, -, +, -, ..., 

count from the main diagonal to the appropriate square. 

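EXAMPLE 8.8 Let 
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}.
\]
Find the following minors and cofactors: (a) $|M_{23}|$ and $A_{23}$, (b) $|M_{31}|$ and $A_{31}$. 

(a) Delete row 2 and column 3 of A: 
\[
|M_{23}| = \begin{vmatrix} 1 & 2 \\ 7 & 8 \end{vmatrix} = 8 - 14 = -6, \quad\text{and so}\quad A_{23} = (-1)^{2+3}|M_{23}| = -(-6) = 6
\]

(b) Delete row 3 and column 1 of A: 
\[
|M_{31}| = \begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} = 12 - 15 = -3, \quad\text{and so}\quad A_{31} = (-1)^{3+1}|M_{31}| = +(-3) = -3
\]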


Laplace Expansion 

The following theorem (proved in Problem 8.32) holds. 

THEOREM 8.8: (Laplace) The determinant of a square matrix $A = [a_{ij}]$ is equal to the sum of the 
products obtained by multiplying the elements of any row (column) by their respective 
cofactors: 

\[
|A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} = \sum_{j=1}^{n} a_{ij}A_{ij}
\]

\[
|A| = a_{1j}A_{1j} + a_{2j}A_{2j} + \cdots + a_{nj}A_{nj} = \sum_{i=1}^{n} a_{ij}A_{ij}
\]


The above formulas for |A| are called the Laplace expansions of the determinant of A by the ith row 
and the jth column. Together with the elementary row (column) operations, they offer a method of 
simplifying the computation of |A|, as described below. 
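The Laplace expansion lends itself to a short recursive sketch. The names `minor`, `cofactor`, `det`, and `expand_along_row` below are assumptions introduced for illustration; indices are 0-based, which leaves the parity of i + j unchanged, and matrices are assumed to be lists of lists.

```python
def minor(A, i, j):
    """Submatrix M_ij: A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum(A[0][j] * cofactor(A, 0, j) for j in range(len(A)))

def cofactor(A, i, j):
    """Signed minor (-1)^(i+j) |M_ij|."""
    return (-1) ** (i + j) * det(minor(A, i, j))

def expand_along_row(A, i):
    """Laplace expansion of |A| by row i; every row gives the same value."""
    return sum(A[i][j] * cofactor(A, i, j) for j in range(len(A)))
```

With the matrix of Example 8.8, `cofactor(A, 1, 2)` and `cofactor(A, 2, 0)` correspond to $A_{23}$ and $A_{31}$. The recursion performs on the order of n! multiplications, which is why the text turns next to row operations.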


8.8 Evaluation of Determinants 


The following algorithm reduces the evaluation of a determinant of order n to the evaluation of a 
determinant of order n — 1. 

ALGORITHM 8.1: (Reduction of the order of a determinant) The input is a nonzero n-square matrix 
$A = [a_{ij}]$ with n > 1. 

Step 1. Choose an element $a_{ij} = 1$ or, if lacking, $a_{ij} \neq 0$. 

Step 2. Using $a_{ij}$ as a pivot, apply elementary row (column) operations to put 0’s in all the other 
positions in the column (row) containing $a_{ij}$. 

Step 3. Expand the determinant by the column (row) containing $a_{ij}$. 

The following remarks are in order. 


Remark 1: Algorithm 8.1 is usually used for determinants of order 4 or more. With determinants of 
order less than 4, one uses the specific formulas for the determinant. 


Remark 2: Gaussian elimination or, equivalently, repeated use of Algorithm 8.1 together with row 
interchanges can be used to transform a matrix A into an upper triangular matrix whose determinant is the 
product of its diagonal entries. However, one must keep track of the number of row interchanges, because 
each row interchange changes the sign of the determinant. 
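The procedure of Remark 2 can be sketched as follows. This is an illustration under stated assumptions: the name `det_by_elimination` is introduced here, exact fractions are used to avoid rounding, and only the two operations mentioned in the remark (adding a multiple of a row to another row, and interchanging rows) are applied.

```python
from fractions import Fraction

def det_by_elimination(A):
    """Reduce A to upper triangular form and multiply the diagonal entries.

    Row additions leave the determinant unchanged (Theorem 8.3(iii));
    each row interchange flips its sign (Theorem 8.3(i)).
    """
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a zero appears on the diagonal of the triangular form
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign  # keep track of the interchange
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]
    return result
```

Unlike the recursive cofactor expansion, this method needs on the order of n^3 arithmetic operations.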


EXAMPLE 8.9 Use Algorithm 8.1 to find the determinant of 
\[
A = \begin{bmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{bmatrix}.
\]

Use $a_{23} = 1$ as a pivot to put 0’s in the other positions of the third column; that is, apply the row operations 
“Replace $R_1$ by $-2R_2 + R_1$,” “Replace $R_3$ by $3R_2 + R_3$,” and “Replace $R_4$ by $R_2 + R_4$.” By Theorem 8.3(iii), the 
value of the determinant does not change under these operations. Thus, 

\[
|A| = \begin{vmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{vmatrix}
= \begin{vmatrix} 1 & -2 & 0 & 5 \\ 2 & 3 & 1 & -2 \\ 1 & 2 & 0 & 3 \\ 3 & 1 & 0 & 2 \end{vmatrix}
\]

Now expand by the third column. Specifically, neglect all terms that contain 0 and use the fact that the sign of the 
minor $M_{23}$ is $(-1)^{2+3} = -1$. Thus, 

\[
|A| = -\begin{vmatrix} 1 & -2 & 5 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{vmatrix}
= -(4 - 18 + 5 - 30 - 3 + 4) = -(-38) = 38
\]


8.9 Classical Adjoint 

Let $A = [a_{ij}]$ be an n x n matrix over a field K and let $A_{ij}$ denote the cofactor of $a_{ij}$. The classical adjoint of 
A, denoted by adj A, is the transpose of the matrix of cofactors of A. Namely, 

\[
\operatorname{adj} A = [A_{ij}]^T
\]


We say “classical adjoint” instead of simply “adjoint” because the term “adjoint” is currently used for an 
entirely different concept. 


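EXAMPLE 8.10 Let 
\[
A = \begin{bmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{bmatrix}.
\]
The cofactors of the nine elements of A follow: 

\[
\begin{aligned}
A_{11} &= +\begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18, &
A_{12} &= -\begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2, &
A_{13} &= +\begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4 \\
A_{21} &= -\begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11, &
A_{22} &= +\begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14, &
A_{23} &= -\begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5 \\
A_{31} &= +\begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10, &
A_{32} &= -\begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4, &
A_{33} &= +\begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8
\end{aligned}
\]

The transpose of the above matrix of cofactors yields the classical adjoint of A; that is, 

\[
\operatorname{adj} A = \begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix}
\]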

The following theorem (proved in Problem 8.34) holds. 

THEOREM 8.9: Let A be any square matrix. Then 

\[
A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A|\,I
\]

where I is the identity matrix. Thus, if $|A| \neq 0$, 

\[
A^{-1} = \frac{1}{|A|}(\operatorname{adj} A)
\]
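Theorem 8.9 gives a direct, if inefficient, way to compute an inverse. The sketch below is illustrative only; the names `det`, `minor`, `classical_adjoint`, and `inverse` are assumptions introduced here, and exact fractions are used for the entries of the inverse.

```python
from fractions import Fraction

def minor(A, i, j):
    """A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 0:
        return 1  # convention for the empty matrix, so a 1x1 matrix has adjoint [1]
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def classical_adjoint(A):
    """Transpose of the matrix of cofactors of A."""
    n = len(A)
    cof = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)] for i in range(n)]
    return [[cof[j][i] for j in range(n)] for i in range(n)]

def inverse(A):
    """A^{-1} = (1/|A|) adj A, defined only when |A| is nonzero."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular: |A| = 0")
    return [[Fraction(entry, d) for entry in row] for row in classical_adjoint(A)]
```

For the matrix A of Example 8.10, `classical_adjoint(A)` reproduces the adjoint found there. For larger matrices, row reduction is far cheaper than computing n^2 cofactors.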

EXAMPLE 8.11 Let A be the matrix in Example 8.10. We have 

det(A) = -40 + 6 + 0 - 16 + 4 + 0 = -46 

Thus, A does have an inverse, and, by Theorem 8.9, 

\[
A^{-1} = \frac{1}{|A|}(\operatorname{adj} A)
= -\frac{1}{46}\begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix}
= \begin{bmatrix} \frac{9}{23} & \frac{11}{46} & \frac{5}{23} \\[2pt] -\frac{1}{23} & -\frac{7}{23} & \frac{2}{23} \\[2pt] -\frac{2}{23} & -\frac{5}{46} & \frac{4}{23} \end{bmatrix}
\]

8.10 Applications to Linear Equations, Cramer's Rule 

Consider a system AX = B of n linear equations in n unknowns. Here $A = [a_{ij}]$ is the (square) matrix of 
coefficients and $B = [b_i]$ is the column vector of constants. Let $A_i$ be the matrix obtained from A by 
replacing the ith column of A by the column vector B. Furthermore, let 

\[
D = \det(A), \quad N_1 = \det(A_1), \quad N_2 = \det(A_2), \quad \ldots, \quad N_n = \det(A_n)
\]

The fundamental relationship between determinants and the solution of the system AX = B follows. 

THEOREM 8.10: The (square) system AX = B has a unique solution if and only if $D \neq 0$. In this case, the 
unique solution is given by 

\[
x_1 = \frac{N_1}{D}, \quad x_2 = \frac{N_2}{D}, \quad \ldots, \quad x_n = \frac{N_n}{D}
\]

The above theorem (proved in Problem 8.10) is known as Cramer’s rule for solving systems of linear 
equations. We emphasize that the theorem only refers to a system with the same number of equations as 
unknowns, and that it only gives the solution when $D \neq 0$. In fact, if D = 0, the theorem does not tell us 
whether or not the system has a solution. However, in the case of a homogeneous system, we have the 
following useful result (to be proved in Problem 8.54). 

THEOREM 8.11: A square homogeneous system AX = 0 has a nonzero solution if and only if 
D = |A| = 0. 

EXAMPLE 8.12 Solve the system using determinants: 

\[
\begin{cases} x + y + z = 5 \\ x - 2y - 3z = -1 \\ 2x + y - z = 3 \end{cases}
\]

First compute the determinant D of the matrix of coefficients: 

\[
D = \begin{vmatrix} 1 & 1 & 1 \\ 1 & -2 & -3 \\ 2 & 1 & -1 \end{vmatrix} = 2 - 6 + 1 + 4 + 3 + 1 = 5
\]

Because $D \neq 0$, the system has a unique solution. To compute $N_x$, $N_y$, $N_z$, we replace, respectively, the coefficients of 
x, y, z in the matrix of coefficients by the constant terms. This yields 

\[
N_x = \begin{vmatrix} 5 & 1 & 1 \\ -1 & -2 & -3 \\ 3 & 1 & -1 \end{vmatrix} = 20, \quad
N_y = \begin{vmatrix} 1 & 5 & 1 \\ 1 & -1 & -3 \\ 2 & 3 & -1 \end{vmatrix} = -10, \quad
N_z = \begin{vmatrix} 1 & 1 & 5 \\ 1 & -2 & -1 \\ 2 & 1 & 3 \end{vmatrix} = 15
\]

Thus, the unique solution of the system is $x = N_x/D = 4$, $y = N_y/D = -2$, $z = N_z/D = 3$; that is, the 
vector u = (4, -2, 3). 
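Cramer's rule for a square system of any order can be sketched as below. The names `det` and `cramer` are assumptions introduced for this illustration; the coefficient matrix is assumed to be a list of row lists and the constants a flat list, with integer or fractional entries.

```python
from fractions import Fraction

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j in range(len(A)):
        sub = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(sub)
    return total

def cramer(A, B):
    """Solve AX = B by Cramer's rule; returns None when D = 0."""
    D = det(A)
    if D == 0:
        return None  # the rule gives no information in this case
    solution = []
    for i in range(len(A)):
        # A_i: replace the ith column of A by the column of constants.
        A_i = [row[:i] + [B[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(A_i), D))
    return solution
```

Calling `cramer([[1, 1, 1], [1, -2, -3], [2, 1, -1]], [5, -1, 3])` reproduces the solution (4, -2, 3) of Example 8.12. Because n + 1 determinants are needed, the rule is mainly of theoretical interest for large n.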


8.11 Submatrices, Minors, Principal Minors 

Let $A = [a_{ij}]$ be a square matrix of order n. Consider any r rows and r columns of A. That is, consider any 
set $I = (i_1, i_2, \ldots, i_r)$ of r row indices and any set $J = (j_1, j_2, \ldots, j_r)$ of r column indices. Then I and J 
define an r x r submatrix of A, denoted by A(I; J), obtained by deleting the rows and columns of A whose 
subscripts do not belong to I or J, respectively. That is, 

\[
A(I; J) = [a_{st} : s \in I,\ t \in J]
\]

The determinant |A(I; J)| is called a minor of A of order r and 

\[
(-1)^{i_1 + i_2 + \cdots + i_r + j_1 + j_2 + \cdots + j_r}\, |A(I; J)|
\]

is the corresponding signed minor. (Note that a minor of order n - 1 is a minor in the sense of Section 8.7, 
and the corresponding signed minor is a cofactor.) Furthermore, if I' and J' denote, respectively, the 
remaining row and column indices, then 

\[
|A(I'; J')|
\]

denotes the complementary minor, and its sign (Problem 8.74) is the same sign as the minor. 


EXAMPLE 8.13 Let $A = [a_{ij}]$ be a 5-square matrix, and let I = {1, 2, 4} and J = {2, 3, 5}. Then 
I' = {3, 5} and J' = {1, 4}, and the corresponding minor |M| and complementary minor |M'| are as 
follows: 

\[
|M| = |A(I; J)| = \begin{vmatrix} a_{12} & a_{13} & a_{15} \\ a_{22} & a_{23} & a_{25} \\ a_{42} & a_{43} & a_{45} \end{vmatrix}
\quad\text{and}\quad
|M'| = |A(I'; J')| = \begin{vmatrix} a_{31} & a_{34} \\ a_{51} & a_{54} \end{vmatrix}
\]

Because 1 + 2 + 4 + 2 + 3 + 5 = 17 is odd, -|M| is the signed minor, and -|M'| is the signed complementary 
minor. 


Principal Minors 

A minor is principal if the row and column indices are the same, or equivalently, if the diagonal elements 
of the minor come from the diagonal of the matrix. We note that the sign of a principal minor is always 
+1, because the sum of the row and identical column subscripts must always be even. 

EXAMPLE 8.14 Let 
\[
A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 5 & 4 \\ -3 & 1 & -2 \end{bmatrix}.
\]
Find the sums $C_1$, $C_2$, and $C_3$ of the principal minors of A of orders 1, 2, and 3, respectively. 

(a) There are three principal minors of order 1. These are 

|1| = 1, |5| = 5, |-2| = -2, and so $C_1$ = 1 + 5 - 2 = 4 

Note that $C_1$ is simply the trace of A. Namely, $C_1$ = tr(A). 

(b) There are three ways to choose two of the three diagonal elements, and each choice gives a minor of order 2. 
These are 

\[
\begin{vmatrix} 1 & 2 \\ 3 & 5 \end{vmatrix} = -1, \qquad
\begin{vmatrix} 1 & -1 \\ -3 & -2 \end{vmatrix} = -2 - 3 = -5, \qquad
\begin{vmatrix} 5 & 4 \\ 1 & -2 \end{vmatrix} = -14
\]

(Note that these minors of order 2 are the cofactors $A_{33}$, $A_{22}$, and $A_{11}$ of A, respectively.) Thus, 

$C_2$ = -1 - 5 - 14 = -20 

(c) There is only one way to choose three of the three diagonal elements. Thus, the only minor of order 3 is the 
determinant of A itself. Thus, 

$C_3$ = |A| = -10 - 24 - 3 - 15 - 4 + 12 = -44 
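The sum of the principal minors of a given order can be computed by enumerating the subsets of diagonal positions. The sketch below is illustrative; the names `det` and `principal_minor_sum` are assumptions introduced here, and indices are 0-based.

```python
from itertools import combinations

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j in range(len(A)):
        sub = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(sub)
    return total

def principal_minor_sum(A, k):
    """Sum of all principal minors of order k of the square matrix A."""
    n = len(A)
    total = 0
    for idx in combinations(range(n), k):
        # Same index set for rows and columns, so the minor is principal.
        sub = [[A[i][j] for j in idx] for i in idx]
        total += det(sub)
    return total
```

For k = 1 the function returns the trace of A, and for k = n it returns |A| itself.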

8.12 Block Matrices and Determinants 


The following theorem (proved in Problem 8.36) is one main result of this section. 


THEOREM 8.12: Suppose M is an upper (lower) triangular block matrix with the diagonal blocks 
$A_1, A_2, \ldots, A_n$. Then 

\[
\det(M) = \det(A_1)\det(A_2)\cdots\det(A_n)
\]


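EXAMPLE 8.15 Find |M| where 
\[
M = \left[\begin{array}{cc|ccc}
2 & 3 & 4 & 7 & 8 \\
-1 & 5 & 3 & 2 & 1 \\ \hline
0 & 0 & 2 & 1 & 5 \\
0 & 0 & 3 & -1 & 4 \\
0 & 0 & 5 & 2 & 6
\end{array}\right]
\]

Note that M is an upper triangular block matrix. Evaluate the determinant of each diagonal block: 

\[
\begin{vmatrix} 2 & 3 \\ -1 & 5 \end{vmatrix} = 10 + 3 = 13, \qquad
\begin{vmatrix} 2 & 1 & 5 \\ 3 & -1 & 4 \\ 5 & 2 & 6 \end{vmatrix} = -12 + 20 + 30 + 25 - 16 - 18 = 29
\]

Then |M| = 13(29) = 377. 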

Another result of this section follows: 


THEOREM 8.13: Consider the block matrix $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, where A is nonsingular, A is r x r, and D is 
s x s. Then $\det(M) = \det(A)\det(D - CA^{-1}B)$. 

Proof: Follows from the fact that 
\[
M = \begin{bmatrix} I & 0 \\ CA^{-1} & I \end{bmatrix}\begin{bmatrix} A & B \\ 0 & D - CA^{-1}B \end{bmatrix}
\]
and Theorem 8.12. 

8.13 Determinants and Volume 


Determinants are related to the notions of area and volume as follows. Let $u_1, u_2, \ldots, u_n$ be vectors in $\mathbf{R}^n$. 
Let S be the (solid) parallelepiped determined by the vectors; that is, 

\[
S = \{a_1 u_1 + a_2 u_2 + \cdots + a_n u_n : 0 \le a_i \le 1 \text{ for } i = 1, \ldots, n\}
\]

(When n = 2, S is a parallelogram.) Let V(S) denote the volume of S (or area of S when n = 2). Then 

V(S) = absolute value of det(A) 

where A is the matrix with rows $u_1, u_2, \ldots, u_n$. In general, V(S) = 0 if and only if the vectors $u_1, \ldots, u_n$ do 
not form a coordinate system for $\mathbf{R}^n$ (i.e., if and only if the vectors are linearly dependent). 

EXAMPLE 8.16 Let $u_1 = (1, 1, 0)$, $u_2 = (1, 1, 1)$, $u_3 = (0, 2, 3)$. Find the volume V(S) of the 
parallelepiped S in $\mathbf{R}^3$ (Fig. 8-2) determined by the three vectors. 

Evaluate the determinant of the matrix whose rows are $u_1, u_2, u_3$: 

\[
\begin{vmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 3 \end{vmatrix} = 3 + 0 + 0 - 0 - 2 - 3 = -2
\]

Hence, V(S) = |-2| = 2. 
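The volume computation in $\mathbf{R}^3$ amounts to one determinant and an absolute value. The following sketch is illustrative only; the names `det3` and `parallelepiped_volume` are assumptions introduced here.

```python
def det3(m):
    """Determinant of a 3x3 matrix by the six-product rule."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    return (a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32
            - a13 * a22 * a31 - a12 * a21 * a33 - a11 * a23 * a32)

def parallelepiped_volume(u1, u2, u3):
    """V(S) = |det(A)|, where the rows of A are the vectors u1, u2, u3."""
    return abs(det3([u1, u2, u3]))
```

A result of 0 signals that the three vectors are linearly dependent, that is, that they lie in a common plane through the origin.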

8.14 Determinant of a Linear Operator 


Let F be a linear operator on a vector space V with finite dimension. Let A be the matrix representation of 
F relative to some basis S of V. Then we define the determinant of F, written det(F), by 

det(F) = |A| 

If B were another matrix representation of F relative to another basis S' of V, then A and B are similar 
matrices (Theorem 6.7) and |B| = |A| (Theorem 8.7). In other words, the above definition det(F) is 
independent of the particular basis S of V. (We say that the definition is well defined.) 

The next theorem (to be proved in Problem 8.62) follows from analogous theorems on matrices. 


THEOREM 8.14: Let F and G be linear operators on a vector space V. Then 

(i) $\det(F \circ G) = \det(F)\det(G)$. 

(ii) F is invertible if and only if $\det(F) \neq 0$. 


EXAMPLE 8.17 Let F be the following linear operator on $\mathbf{R}^3$ and let A be the matrix that represents F 
relative to the usual basis of $\mathbf{R}^3$: 

\[
F(x, y, z) = (2x - 4y + z,\ x - 2y + 3z,\ 5x + y - z)
\quad\text{and}\quad
A = \begin{bmatrix} 2 & -4 & 1 \\ 1 & -2 & 3 \\ 5 & 1 & -1 \end{bmatrix}
\]

Then 

det(F) = |A| = 4 - 60 + 1 + 10 - 6 - 4 = -55 

8.15 Multilinearity and Determinants 

Let V be a vector space over a field K. Let $\mathcal{A} = V^n$; that is, $\mathcal{A}$ consists of all the n-tuples 

\[
A = (A_1, A_2, \ldots, A_n)
\]

where the $A_i$ are vectors in V. The following definitions apply. 

DEFINITION 8.2: A function $D: \mathcal{A} \to K$ is said to be multilinear if it is linear in each component: 

(i) If $A_i = B + C$, then 

D(A) = D(..., B + C, ...) = D(..., B, ...) + D(..., C, ...) 

(ii) If $A_i = kB$, where $k \in K$, then 

D(A) = D(..., kB, ...) = kD(..., B, ...) 

We also say n-linear for multilinear if there are n components. 

DEFINITION 8.3: A function $D: \mathcal{A} \to K$ is said to be alternating if D(A) = 0 whenever A has two 
identical elements: 

\[
D(A_1, A_2, \ldots, A_n) = 0 \quad\text{whenever } A_i = A_j,\ i \neq j
\]

Now let M denote the set of all n-square matrices A over a field K. We may view A as an n-tuple 
consisting of its row vectors $A_1, A_2, \ldots, A_n$; that is, we may view A in the form $A = (A_1, A_2, \ldots, A_n)$. 

The following theorem (proved in Problem 8.37) characterizes the determinant function. 

THEOREM 8.15: There exists a unique function $D: M \to K$ such that 

(i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1. 

This function D is the determinant function; that is, D(A) = |A|, for any matrix 
$A \in M$. 


SOLVED PROBLEMS 


Computation of Determinants 

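8.1. Evaluate the determinant of each of the following matrices: 

\[
\text{(a) } A = \begin{bmatrix} 6 & 5 \\ 2 & 3 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 2 & -3 \\ 4 & 7 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} 4 & -5 \\ -1 & -2 \end{bmatrix}, \quad
\text{(d) } D = \begin{bmatrix} t - 5 & 6 \\ 3 & t + 2 \end{bmatrix}
\]

Use the formula $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$. 

(a) |A| = 6(3) - 5(2) = 18 - 10 = 8 

(b) |B| = 14 + 12 = 26 

(c) |C| = -8 - 5 = -13 

(d) |D| = (t - 5)(t + 2) - 18 = $t^2 - 3t - 10 - 18 = t^2 - 3t - 28$ 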


8.2. Evaluate the determinant of each of the following matrices: 

\[
\text{(a) } A = \begin{bmatrix} 2 & 3 & 4 \\ 5 & 4 & 3 \\ 1 & 2 & 1 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 4 & -1 \\ 1 & 5 & -2 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} 1 & 3 & -5 \\ 3 & -1 & 2 \\ 1 & -2 & 1 \end{bmatrix}
\]

Use the diagram in Fig. 8-1 to obtain the six products: 

(a) |A| = 2(4)(1) + 3(3)(1) + 4(2)(5) - 1(4)(4) - 2(3)(2) - 1(3)(5) = 8 + 9 + 40 - 16 - 12 - 15 = 14 

(b) |B| = -8 + 2 + 30 - 12 + 5 - 8 = 9 

(c) |C| = -1 + 6 + 30 - 5 + 4 - 9 = 25 


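8.3. Compute the determinant of each of the following matrices: 

\[
\text{(a) } A = \begin{bmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 1 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 4 & -6 & 8 & 9 \\ 0 & -2 & 7 & -3 \\ 0 & 0 & 5 & 6 \\ 0 & 0 & 0 & 3 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} \frac{1}{2} & -1 & -\frac{1}{3} \\[2pt] \frac{3}{4} & \frac{1}{2} & -1 \\[2pt] 1 & -4 & 1 \end{bmatrix}
\]

(a) One can simplify the entries by first subtracting twice the first row from the second row, that is, by 
applying the row operation “Replace $R_2$ by $-2R_1 + R_2$.” Then 

\[
|A| = \begin{vmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 1 \end{vmatrix}
= \begin{vmatrix} 2 & 3 & 4 \\ 1 & 0 & -1 \\ 8 & 9 & 1 \end{vmatrix}
= 0 - 24 + 36 - 0 + 18 - 3 = 27
\]

(b) B is triangular, so |B| = product of the diagonal entries = -120. 

(c) The arithmetic is simpler if fractions are first eliminated. Hence, multiply the first row $R_1$ by 6 and the 
second row $R_2$ by 4. Then 

\[
|24C| = \begin{vmatrix} 3 & -6 & -2 \\ 3 & 2 & -4 \\ 1 & -4 & 1 \end{vmatrix}
= 6 + 24 + 24 + 4 - 48 + 18 = 28, \quad\text{so}\quad |C| = \frac{28}{24} = \frac{7}{6}
\]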

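8.4. Compute the determinant of each of the following matrices: 

\[
\text{(a) } A = \begin{bmatrix} 2 & 5 & -3 & -2 \\ -2 & -3 & 2 & -5 \\ 1 & 3 & -2 & 2 \\ -1 & -6 & 4 & 3 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 6 & 2 & 1 & 0 & 5 \\ 2 & 1 & 1 & -2 & 1 \\ 1 & 1 & 2 & -2 & 3 \\ 3 & 0 & 2 & 3 & -1 \\ -1 & -1 & -3 & 4 & 2 \end{bmatrix}
\]

(a) Use $a_{31} = 1$ as a pivot to put 0’s in the first column, by applying the row operations “Replace $R_1$ by 
$-2R_3 + R_1$,” “Replace $R_2$ by $2R_3 + R_2$,” and “Replace $R_4$ by $R_3 + R_4$.” Then 

\[
|A| = \begin{vmatrix} 2 & 5 & -3 & -2 \\ -2 & -3 & 2 & -5 \\ 1 & 3 & -2 & 2 \\ -1 & -6 & 4 & 3 \end{vmatrix}
= \begin{vmatrix} 0 & -1 & 1 & -6 \\ 0 & 3 & -2 & -1 \\ 1 & 3 & -2 & 2 \\ 0 & -3 & 2 & 5 \end{vmatrix}
= \begin{vmatrix} -1 & 1 & -6 \\ 3 & -2 & -1 \\ -3 & 2 & 5 \end{vmatrix}
= 10 + 3 - 36 + 36 - 2 - 15 = -4
\]

(b) First reduce |B| to a determinant of order 4, and then to a determinant of order 3, for which we can use 
Fig. 8-1. First use $b_{22} = 1$ as a pivot to put 0’s in the second column, by applying the row operations 
“Replace $R_1$ by $-2R_2 + R_1$,” “Replace $R_3$ by $-R_2 + R_3$,” and “Replace $R_5$ by $R_2 + R_5$.” Then 

\[
|B| = \begin{vmatrix} 2 & 0 & -1 & 4 & 3 \\ 2 & 1 & 1 & -2 & 1 \\ -1 & 0 & 1 & 0 & 2 \\ 3 & 0 & 2 & 3 & -1 \\ 1 & 0 & -2 & 2 & 3 \end{vmatrix}
= \begin{vmatrix} 2 & -1 & 4 & 3 \\ -1 & 1 & 0 & 2 \\ 3 & 2 & 3 & -1 \\ 1 & -2 & 2 & 3 \end{vmatrix}
\]

In the determinant of order 4, use the entry 1 in the second row as a pivot and apply the column operations “Replace $C_1$ by $C_2 + C_1$” and “Replace $C_4$ by $-2C_2 + C_4$” to put 0’s in the second row. Then expand by the second row: 

\[
|B| = \begin{vmatrix} 1 & -1 & 4 & 5 \\ 0 & 1 & 0 & 0 \\ 5 & 2 & 3 & -5 \\ -1 & -2 & 2 & 7 \end{vmatrix}
= \begin{vmatrix} 1 & 4 & 5 \\ 5 & 3 & -5 \\ -1 & 2 & 7 \end{vmatrix}
= 21 + 20 + 50 + 15 + 10 - 140 = -24
\]

Cofactors, Classical Adjoints, Minors, Principal Minors 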


8.5. Let 
\[
A = \begin{bmatrix} 2 & 1 & -3 & 4 \\ 5 & -4 & 7 & -2 \\ 4 & 0 & 6 & -3 \\ 3 & -2 & 5 & 2 \end{bmatrix}.
\]

(a) Find $A_{23}$, the cofactor (signed minor) of 7 in A. 

(b) Find the minor and the signed minor of the submatrix M = A(2, 4; 2, 3). 

(c) Find the principal minor determined by the first and third diagonal entries, that is, by 
M = A(1, 3; 1, 3). 

(a) Take the determinant of the submatrix of A obtained by deleting row 2 and column 3 (those which contain 
the 7), and multiply the determinant by $(-1)^{2+3}$: 

\[
A_{23} = -\begin{vmatrix} 2 & 1 & 4 \\ 4 & 0 & -3 \\ 3 & -2 & 2 \end{vmatrix} = -(-61) = 61
\]

The exponent 2 + 3 comes from the subscripts of $A_{23}$, that is, from the fact that 7 appears in row 2 and 
column 3. 

(b) The row subscripts are 2 and 4 and the column subscripts are 2 and 3. Hence, the minor is the 
determinant 

\[
|M| = \begin{vmatrix} a_{22} & a_{23} \\ a_{42} & a_{43} \end{vmatrix} = \begin{vmatrix} -4 & 7 \\ -2 & 5 \end{vmatrix} = -20 + 14 = -6
\]

and the signed minor is $(-1)^{2+4+2+3}|M| = -|M| = -(-6) = 6$. 

(c) The principal minor is the determinant 

\[
|M| = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} = \begin{vmatrix} 2 & -3 \\ 4 & 6 \end{vmatrix} = 12 + 12 = 24
\]

Note that now the diagonal entries of the submatrix are diagonal entries of the original matrix. Also, the 
sign of the principal minor is positive. 


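8.6. Let 
\[
B = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \\ 5 & 8 & 9 \end{bmatrix}.
\]
Find: (a) |B|, (b) adj B, (c) $B^{-1}$ using adj B. 

(a) |B| = 27 + 20 + 16 - 15 - 32 - 18 = -2 

(b) Take the transpose of the matrix of cofactors: 

\[
\operatorname{adj} B =
\begin{bmatrix}
\begin{vmatrix} 3 & 4 \\ 8 & 9 \end{vmatrix} & -\begin{vmatrix} 2 & 4 \\ 5 & 9 \end{vmatrix} & \begin{vmatrix} 2 & 3 \\ 5 & 8 \end{vmatrix} \\[6pt]
-\begin{vmatrix} 1 & 1 \\ 8 & 9 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 5 & 9 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 5 & 8 \end{vmatrix} \\[6pt]
\begin{vmatrix} 1 & 1 \\ 3 & 4 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 2 & 4 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix}
\end{bmatrix}^T
= \begin{bmatrix} -5 & 2 & 1 \\ -1 & 4 & -3 \\ 1 & -2 & 1 \end{bmatrix}^T
= \begin{bmatrix} -5 & -1 & 1 \\ 2 & 4 & -2 \\ 1 & -3 & 1 \end{bmatrix}
\]

(c) Because $|B| \neq 0$, 

\[
B^{-1} = \frac{1}{|B|}(\operatorname{adj} B)
= \frac{1}{-2}\begin{bmatrix} -5 & -1 & 1 \\ 2 & 4 & -2 \\ 1 & -3 & 1 \end{bmatrix}
= \begin{bmatrix} \frac{5}{2} & \frac{1}{2} & -\frac{1}{2} \\[2pt] -1 & -2 & 1 \\[2pt] -\frac{1}{2} & \frac{3}{2} & -\frac{1}{2} \end{bmatrix}
\]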


8.7. Let 
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 0 & 7 & 8 \end{bmatrix},
\]
and let $S_k$ denote the sum of its principal minors of order k. Find $S_k$ for 
(a) k = 1, (b) k = 2, (c) k = 3. 

(a) The principal minors of order 1 are the diagonal elements. Thus, $S_1$ is the trace of A; that is, 

$S_1$ = tr(A) = 1 + 5 + 8 = 14 

(b) The principal minors of order 2 are the cofactors of the diagonal elements. Thus, 

\[
S_2 = A_{11} + A_{22} + A_{33}
= \begin{vmatrix} 5 & 6 \\ 7 & 8 \end{vmatrix} + \begin{vmatrix} 1 & 3 \\ 0 & 8 \end{vmatrix} + \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix}
= -2 + 8 - 3 = 3
\]

(c) There is only one principal minor of order 3, the determinant of A. Then 

$S_3$ = |A| = 40 + 0 + 84 - 0 - 42 - 64 = 18 


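8.8. Let 
\[
A = \begin{bmatrix} 1 & 3 & 0 & -1 \\ -4 & 2 & 5 & 1 \\ 1 & 0 & 3 & -2 \\ 3 & -2 & 1 & 4 \end{bmatrix}.
\]
Find the number $N_k$ and sum $S_k$ of principal minors of order: 
(a) k = 1, (b) k = 2, (c) k = 3, (d) k = 4. 

Each (nonempty) subset of the diagonal (or equivalently, each nonempty subset of {1, 2, 3, 4}) 
determines a principal minor of A, and 
\[
N_k = \binom{n}{k} = \frac{n!}{k!(n-k)!}
\]
of them are of order k. 

Thus, $N_1 = \binom{4}{1} = 4$, $N_2 = \binom{4}{2} = 6$, $N_3 = \binom{4}{3} = 4$, $N_4 = \binom{4}{4} = 1$. 

(a) $S_1$ = |1| + |2| + |3| + |4| = 1 + 2 + 3 + 4 = 10 

(b) 
\[
\begin{aligned}
S_2 &= \begin{vmatrix} 1 & 3 \\ -4 & 2 \end{vmatrix} + \begin{vmatrix} 1 & 0 \\ 1 & 3 \end{vmatrix} + \begin{vmatrix} 1 & -1 \\ 3 & 4 \end{vmatrix}
 + \begin{vmatrix} 2 & 5 \\ 0 & 3 \end{vmatrix} + \begin{vmatrix} 2 & 1 \\ -2 & 4 \end{vmatrix} + \begin{vmatrix} 3 & -2 \\ 1 & 4 \end{vmatrix} \\
&= 14 + 3 + 7 + 6 + 10 + 14 = 54
\end{aligned}
\]

(c) 
\[
\begin{aligned}
S_3 &= \begin{vmatrix} 1 & 3 & 0 \\ -4 & 2 & 5 \\ 1 & 0 & 3 \end{vmatrix}
 + \begin{vmatrix} 1 & 3 & -1 \\ -4 & 2 & 1 \\ 3 & -2 & 4 \end{vmatrix}
 + \begin{vmatrix} 1 & 0 & -1 \\ 1 & 3 & -2 \\ 3 & 1 & 4 \end{vmatrix}
 + \begin{vmatrix} 2 & 5 & 1 \\ 0 & 3 & -2 \\ -2 & 1 & 4 \end{vmatrix} \\
&= 57 + 65 + 22 + 54 = 198
\end{aligned}
\]

(d) $S_4$ = det(A) = 378 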


Determinants and Systems of Linear Equations 

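8.9. Use determinants to solve the system 
\[
\begin{cases} 3y + 2x = z + 1 \\ 3x + 2z = 8 - 5y \\ 3z - 1 = x - 2y \end{cases}
\]

First arrange the equations in standard form, then compute the determinant D of the matrix of coefficients: 

\[
\begin{aligned}
2x + 3y - z &= 1 \\
3x + 5y + 2z &= 8 \\
x - 2y - 3z &= -1
\end{aligned}
\quad\text{and}\quad
D = \begin{vmatrix} 2 & 3 & -1 \\ 3 & 5 & 2 \\ 1 & -2 & -3 \end{vmatrix} = -30 + 6 + 6 + 5 + 8 + 27 = 22
\]

Because $D \neq 0$, the system has a unique solution. To compute $N_x$, $N_y$, $N_z$, we replace, respectively, the 
coefficients of x, y, z in the matrix of coefficients by the constant terms. Then 

\[
N_x = \begin{vmatrix} 1 & 3 & -1 \\ 8 & 5 & 2 \\ -1 & -2 & -3 \end{vmatrix} = 66, \quad
N_y = \begin{vmatrix} 2 & 1 & -1 \\ 3 & 8 & 2 \\ 1 & -1 & -3 \end{vmatrix} = -22, \quad
N_z = \begin{vmatrix} 2 & 3 & 1 \\ 3 & 5 & 8 \\ 1 & -2 & -1 \end{vmatrix} = 44
\]

Thus, 

\[
x = \frac{N_x}{D} = \frac{66}{22} = 3, \qquad y = \frac{N_y}{D} = \frac{-22}{22} = -1, \qquad z = \frac{N_z}{D} = \frac{44}{22} = 2
\]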


8.10. Consider the system 
\[
\begin{cases} kx + y + z = 1 \\ x + ky + z = 1 \\ x + y + kz = 1 \end{cases}
\]

Use determinants to find those values of k for which the system has 
(a) a unique solution, (b) more than one solution, (c) no solution. 

(a) The system has a unique solution when $D \neq 0$, where D is the determinant of the matrix of coefficients. 
Compute 

\[
D = \begin{vmatrix} k & 1 & 1 \\ 1 & k & 1 \\ 1 & 1 & k \end{vmatrix} = k^3 + 1 + 1 - k - k - k = k^3 - 3k + 2 = (k - 1)^2(k + 2)
\]

Thus, the system has a unique solution when 

$(k - 1)^2(k + 2) \neq 0$, that is, when $k \neq 1$ and $k \neq -2$ 

(b and c) Gaussian elimination shows that the system has more than one solution when k = 1, and the system 
has no solution when k = -2. 

Miscellaneous Problems 

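8.11. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^3$ determined by the vectors:

(a) $u_1 = (1,1,1)$, $u_2 = (1,3,-4)$, $u_3 = (1,2,-5)$.

(b) $u_1 = (1,2,4)$, $u_2 = (2,1,-3)$, $u_3 = (5,7,9)$.

$V(S)$ is the absolute value of the determinant of the matrix $M$ whose rows are the given vectors. Thus,

(a) \[ |M| = \begin{vmatrix} 1 & 1 & 1 \\ 1 & 3 & -4 \\ 1 & 2 & -5 \end{vmatrix} = -15 - 4 + 2 - 3 + 8 + 5 = -7. \] Hence, $V(S) = |-7| = 7$.

(b) \[ |M| = \begin{vmatrix} 1 & 2 & 4 \\ 2 & 1 & -3 \\ 5 & 7 & 9 \end{vmatrix} = 9 - 30 + 56 - 20 + 21 - 36 = 0. \] Thus, $V(S) = 0$, or, in other words, $u_1, u_2, u_3$ lie in a plane and are linearly dependent.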
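A brief sketch of the volume computation, assuming only the rule stated above (volume equals the absolute value of the determinant whose rows are the vectors). The names `det3` and `volume` are hypothetical helpers.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def volume(u1, u2, u3):
    """Volume of the parallelepiped spanned by three vectors in R^3."""
    return abs(det3([u1, u2, u3]))


vol_a = volume((1, 1, 1), (1, 3, -4), (1, 2, -5))
vol_b = volume((1, 2, 4), (2, 1, -3), (5, 7, 9))
print(vol_a, vol_b)
```

A zero volume, as in part (b), signals that the three vectors are linearly dependent.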


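8.12. Find $\det(M)$ where

\[ M = \begin{bmatrix} 3 & 4 & 0 & 0 & 0 \\ 2 & 5 & 0 & 0 & 0 \\ 0 & 9 & 2 & 0 & 0 \\ 0 & 5 & 0 & 6 & 7 \\ 0 & 0 & 4 & 3 & 4 \end{bmatrix} \]

Partitioning $M$ after its second and third rows and columns shows that $M$ is a (lower) triangular block matrix; hence, evaluate the determinant of each diagonal block:

\[ \begin{vmatrix} 3 & 4 \\ 2 & 5 \end{vmatrix} = 15 - 8 = 7, \qquad |2| = 2, \qquad \begin{vmatrix} 6 & 7 \\ 3 & 4 \end{vmatrix} = 24 - 21 = 3 \]

Thus, $|M| = 7(2)(3) = 42$.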
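The block shortcut can be compared against a direct evaluation of the full 5 x 5 determinant. This is only a sketch: `det` is a hypothetical brute-force helper based on the permutation expansion, chosen for transparency rather than speed.

```python
from itertools import permutations


def det(m):
    """Determinant by the permutation expansion (fine for n <= 6 or so)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = -1 if inv % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


M = [[3, 4, 0, 0, 0],
     [2, 5, 0, 0, 0],
     [0, 9, 2, 0, 0],
     [0, 5, 0, 6, 7],
     [0, 0, 4, 3, 4]]

# Diagonal blocks of the lower triangular block matrix M.
blocks = [[[3, 4], [2, 5]], [[2]], [[6, 7], [3, 4]]]

block_product = 1
for block in blocks:
    block_product *= det(block)

det_M = det(M)
print(det_M, block_product)
```

The direct expansion visits 120 permutations, while the block rule needs only two 2 x 2 determinants and one scalar.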

8.13. Find the determinant of $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by

\[ F(x, y, z) = (x + 3y - 4z,\; 2y + 7z,\; x + 5y - 3z) \]

The determinant of a linear operator $F$ is equal to the determinant of any matrix that represents $F$. Thus, first find the matrix $A$ representing $F$ in the usual basis (whose rows, respectively, consist of the coefficients of $x, y, z$). Then

\[ A = \begin{bmatrix} 1 & 3 & -4 \\ 0 & 2 & 7 \\ 1 & 5 & -3 \end{bmatrix} \]

and so $\det(F) = |A| = -6 + 21 + 0 + 8 - 35 - 0 = -12$.


8.14. Write out $g = g(x_1, x_2, x_3, x_4)$ explicitly where

\[ g(x_1, x_2, \ldots, x_n) = \prod_{i<j} (x_i - x_j). \]

The symbol $\prod$ is used for a product of terms in the same way that the symbol $\sum$ is used for a sum of terms. That is, $\prod_{i<j} (x_i - x_j)$ means the product of all terms $(x_i - x_j)$ for which $i < j$. Hence,

\[ g = g(x_1, \ldots, x_4) = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4) \]

8.15. Let $D$ be a 2-linear, alternating function. Show that $D(A,B) = -D(B,A)$.

Because $D$ is alternating, $D(A,A) = 0$ and $D(B,B) = 0$. Hence,

\[ D(A + B, A + B) = D(A,A) + D(A,B) + D(B,A) + D(B,B) = D(A,B) + D(B,A) \]

However, $D(A + B, A + B) = 0$. Hence, $D(A,B) = -D(B,A)$, as required.


Permutations 


8.16. Determine the parity (sign) of the permutation $\sigma = 364152$.

Count the number of inversions. That is, for each element $k$, count the number of elements $i$ in $\sigma$ such that $i > k$ and $i$ precedes $k$ in $\sigma$. Namely,

k = 1: 3 numbers (3, 6, 4)        k = 4: 1 number (6)
k = 2: 4 numbers (3, 6, 4, 5)     k = 5: 1 number (6)
k = 3: 0 numbers                  k = 6: 0 numbers

Because $3 + 4 + 0 + 1 + 1 + 0 = 9$ is odd, $\sigma$ is an odd permutation, and $\operatorname{sgn} \sigma = -1$.
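The inversion count can be sketched in a few lines. The functions `inversions` and `sign` below are illustrative names; the permutation is given in the one-line notation used above.

```python
def inversions(perm):
    """Number of pairs (i, k) with i > k and i preceding k in the permutation."""
    n = len(perm)
    return sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])


def sign(perm):
    """+1 for an even permutation, -1 for an odd one."""
    return -1 if inversions(perm) % 2 else 1


sigma = [3, 6, 4, 1, 5, 2]   # sigma = 364152
count = inversions(sigma)
parity = sign(sigma)
print(count, parity)
```

The double loop examines each pair of positions once, so it counts the same inversions as the element-by-element tally in the solution.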


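8.17. Let $\sigma = 24513$ and $\tau = 41352$ be permutations in $S_5$. Find (a) $\tau \circ \sigma$, (b) $\sigma^{-1}$.

Recall that $\sigma = 24513$ and $\tau = 41352$ are short ways of writing

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{pmatrix} \quad\text{or}\quad \sigma(1) = 2,\ \sigma(2) = 4,\ \sigma(3) = 5,\ \sigma(4) = 1,\ \sigma(5) = 3 \]

\[ \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 3 & 5 & 2 \end{pmatrix} \quad\text{or}\quad \tau(1) = 4,\ \tau(2) = 1,\ \tau(3) = 3,\ \tau(4) = 5,\ \tau(5) = 2 \]

(a) The effects of $\sigma$ and then $\tau$ on 1, 2, 3, 4, 5 are as follows:

\[ 1 \to 2 \to 1, \quad 2 \to 4 \to 5, \quad 3 \to 5 \to 2, \quad 4 \to 1 \to 4, \quad 5 \to 3 \to 3 \]

[That is, for example, $(\tau \circ \sigma)(1) = \tau(\sigma(1)) = \tau(2) = 1$.] Thus, $\tau \circ \sigma = 15243$.

(b) By definition, $\sigma^{-1}(j) = k$ if and only if $\sigma(k) = j$. Hence,

\[ \sigma^{-1} = \begin{pmatrix} 2 & 4 & 5 & 1 & 3 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 5 & 2 & 3 \end{pmatrix} \quad\text{or}\quad \sigma^{-1} = 41523 \]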
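Composition and inversion of permutations in one-line notation can be sketched as follows. The helper names `compose` and `inverse` are hypothetical; entries are 1-based to match the notation of the problem.

```python
def compose(tau, sigma):
    """Return tau o sigma in one-line notation: apply sigma first, then tau."""
    return [tau[s - 1] for s in sigma]


def inverse(sigma):
    """Return sigma^{-1}: if sigma(k) = j, then the inverse sends j to k."""
    inv = [0] * len(sigma)
    for k, j in enumerate(sigma, start=1):
        inv[j - 1] = k
    return inv


sigma = [2, 4, 5, 1, 3]   # sigma = 24513
tau = [4, 1, 3, 5, 2]     # tau = 41352

tau_sigma = compose(tau, sigma)
sigma_inv = inverse(sigma)
print(tau_sigma, sigma_inv)
```

Composing `sigma_inv` with `sigma` in either order returns the identity `[1, 2, 3, 4, 5]`, which is a quick sanity check on `inverse`.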


8.18. Let $\sigma = j_1 j_2 \cdots j_n$ be any permutation in $S_n$. Show that, for each inversion $(i, k)$ where $i > k$ but $i$ precedes $k$ in $\sigma$, there is a pair $(i^*, k^*)$ such that

\[ i^* < k^* \quad\text{and}\quad \sigma(i^*) > \sigma(k^*) \qquad (1) \]

and vice versa. Thus, $\sigma$ is even or odd according to whether there is an even or an odd number of pairs satisfying (1).

Choose $i^*$ and $k^*$ so that $\sigma(i^*) = i$ and $\sigma(k^*) = k$. Then $i > k$ if and only if $\sigma(i^*) > \sigma(k^*)$, and $i$ precedes $k$ in $\sigma$ if and only if $i^* < k^*$.

8.19. Consider the polynomials $g = g(x_1, \ldots, x_n)$ and $\sigma(g)$, defined by

\[ g = g(x_1, \ldots, x_n) = \prod_{i<j} (x_i - x_j) \quad\text{and}\quad \sigma(g) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)}) \]

(See Problem 8.14.) Show that $\sigma(g) = g$ when $\sigma$ is an even permutation, and $\sigma(g) = -g$ when $\sigma$ is an odd permutation. That is, $\sigma(g) = (\operatorname{sgn} \sigma) g$.

Because $\sigma$ is one-to-one and onto,

\[ \sigma(g) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)}) = \prod_{i<j \text{ or } i>j} (x_i - x_j) \]

Thus, $\sigma(g) = g$ or $\sigma(g) = -g$ according to whether there is an even or an odd number of terms of the form $x_i - x_j$, where $i > j$. Note that for each pair $(i, j)$ for which

\[ i < j \quad\text{and}\quad \sigma(i) > \sigma(j) \qquad (1) \]

there is a term $(x_{\sigma(i)} - x_{\sigma(j)})$ in $\sigma(g)$ for which $\sigma(i) > \sigma(j)$. Because $\sigma$ is even if and only if there is an even number of pairs satisfying (1), we have $\sigma(g) = g$ if and only if $\sigma$ is even. Hence, $\sigma(g) = -g$ if and only if $\sigma$ is odd.


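8.20. Let $\sigma, \tau \in S_n$. Show that $\operatorname{sgn}(\tau \circ \sigma) = (\operatorname{sgn} \tau)(\operatorname{sgn} \sigma)$. Thus, the product of two even or two odd permutations is even, and the product of an odd and an even permutation is odd.

Using Problem 8.19, we have

\[ \operatorname{sgn}(\tau \circ \sigma)\, g = (\tau \circ \sigma)(g) = \tau(\sigma(g)) = \tau((\operatorname{sgn} \sigma) g) = (\operatorname{sgn} \tau)(\operatorname{sgn} \sigma) g \]

Accordingly, $\operatorname{sgn}(\tau \circ \sigma) = (\operatorname{sgn} \tau)(\operatorname{sgn} \sigma)$.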
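The multiplicativity of the sign can be checked exhaustively for a small case. The sketch below does so for $S_4$; it is a numerical illustration, not a proof, and the names `sign` and `compose` are hypothetical. Permutations are 0-based tuples here for convenience.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation from the parity of its inversion count."""
    n = len(perm)
    inv = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return -1 if inv % 2 else 1


def compose(tau, sigma):
    """tau o sigma on {0, ..., n-1}: apply sigma first, then tau."""
    return tuple(tau[s] for s in sigma)


S4 = list(permutations(range(4)))

# Check sgn(tau o sigma) = (sgn tau)(sgn sigma) for all 24 * 24 pairs.
holds = all(sign(compose(t, s)) == sign(t) * sign(s) for t in S4 for s in S4)
even_count = sum(1 for p in S4 if sign(p) == 1)
print(holds, even_count)
```

Exactly half of the 24 permutations are even, consistent with the sign being a homomorphism onto $\{1, -1\}$.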

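8.21. Consider the permutation $\sigma = j_1 j_2 \cdots j_n$. Show that $\operatorname{sgn} \sigma^{-1} = \operatorname{sgn} \sigma$ and, for scalars $a_{ij}$, show that

\[ a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1 k_1} a_{2 k_2} \cdots a_{n k_n} \]

where $\sigma^{-1} = k_1 k_2 \cdots k_n$.

We have $\sigma^{-1} \circ \sigma = \varepsilon$, the identity permutation. Because $\varepsilon$ is even, $\sigma^{-1}$ and $\sigma$ are both even or both odd. Hence, $\operatorname{sgn} \sigma^{-1} = \operatorname{sgn} \sigma$.

Because $\sigma = j_1 j_2 \cdots j_n$ is a permutation, reordering the factors by their first index gives $a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1 k_1} a_{2 k_2} \cdots a_{n k_n}$. Then $k_1, k_2, \ldots, k_n$ have the property that

\[ \sigma(k_1) = 1, \quad \sigma(k_2) = 2, \quad \ldots, \quad \sigma(k_n) = n \]

Let $\tau = k_1 k_2 \cdots k_n$. Then, for $i = 1, \ldots, n$,

\[ (\sigma \circ \tau)(i) = \sigma(\tau(i)) = \sigma(k_i) = i \]

Thus, $\sigma \circ \tau = \varepsilon$, the identity permutation. Hence, $\tau = \sigma^{-1}$.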

Proofs of Theorems 

8.22. Prove Theorem 8.1: $|A^T| = |A|$.

If $A = [a_{ij}]$, then $A^T = [b_{ij}]$, with $b_{ij} = a_{ji}$. Hence,

\[ |A^T| = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{\sigma(1),1} a_{\sigma(2),2} \cdots a_{\sigma(n),n} \]

Let $\tau = \sigma^{-1}$. By Problem 8.21, $\operatorname{sgn} \tau = \operatorname{sgn} \sigma$, and $a_{\sigma(1),1} a_{\sigma(2),2} \cdots a_{\sigma(n),n} = a_{1\tau(1)} a_{2\tau(2)} \cdots a_{n\tau(n)}$. Hence,

\[ |A^T| = \sum_{\sigma \in S_n} (\operatorname{sgn} \tau)\, a_{1\tau(1)} a_{2\tau(2)} \cdots a_{n\tau(n)} \]

However, as $\sigma$ runs through all the elements of $S_n$, $\tau = \sigma^{-1}$ also runs through all the elements of $S_n$. Thus, $|A^T| = |A|$.


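8.23. Prove Theorem 8.3(i): If two rows (columns) of $A$ are interchanged, then $|B| = -|A|$.

We prove the theorem for the case that two columns are interchanged. Let $\tau$ be the transposition that interchanges the two numbers corresponding to the two columns of $A$ that are interchanged. If $A = [a_{ij}]$ and $B = [b_{ij}]$, then $b_{ij} = a_{i\tau(j)}$. Hence, for any permutation $\sigma$,

\[ b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = a_{1(\tau \circ \sigma)(1)} a_{2(\tau \circ \sigma)(2)} \cdots a_{n(\tau \circ \sigma)(n)} \]

Thus,

\[ |B| = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{1(\tau \circ \sigma)(1)} a_{2(\tau \circ \sigma)(2)} \cdots a_{n(\tau \circ \sigma)(n)} \]

Because the transposition $\tau$ is an odd permutation, $\operatorname{sgn}(\tau \circ \sigma) = (\operatorname{sgn} \tau)(\operatorname{sgn} \sigma) = -\operatorname{sgn} \sigma$. Accordingly, $\operatorname{sgn} \sigma = -\operatorname{sgn}(\tau \circ \sigma)$, and so

\[ |B| = -\sum_{\sigma \in S_n} [\operatorname{sgn}(\tau \circ \sigma)]\, a_{1(\tau \circ \sigma)(1)} a_{2(\tau \circ \sigma)(2)} \cdots a_{n(\tau \circ \sigma)(n)} \]

But as $\sigma$ runs through all the elements of $S_n$, $\tau \circ \sigma$ also runs through all the elements of $S_n$. Hence, $|B| = -|A|$.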

8.24. Prove Theorem 8.2.

(i) If $A$ has a row (column) of zeros, then $|A| = 0$.

(ii) If $A$ has two identical rows (columns), then $|A| = 0$.

(iii) If $A$ is triangular, then $|A|$ = product of diagonal elements. Thus, $|I| = 1$.

(i) Each term in $|A|$ contains a factor from every row, and so from the row of zeros. Thus, each term of $|A|$ is zero, and so $|A| = 0$.

(ii) Suppose $1 + 1 \neq 0$ in $K$. If we interchange the two identical rows of $A$, we still obtain the matrix $A$. Hence, by Problem 8.23, $|A| = -|A|$, and so $|A| = 0$.

Now suppose $1 + 1 = 0$ in $K$. Then $\operatorname{sgn} \sigma = 1$ for every $\sigma \in S_n$. Because $A$ has two identical rows, we can arrange the terms of $|A|$ into pairs of equal terms. Because each pair is 0, the determinant of $A$ is zero.

(iii) Suppose $A = [a_{ij}]$ is lower triangular; that is, the entries above the diagonal are all zero: $a_{ij} = 0$ whenever $i < j$. Consider a term $t$ of the determinant of $A$:

\[ t = (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}, \quad\text{where } \sigma = i_1 i_2 \cdots i_n \]

Suppose $i_1 \neq 1$. Then $1 < i_1$ and so $a_{1 i_1} = 0$; hence, $t = 0$. That is, each term for which $i_1 \neq 1$ is zero.

Now suppose $i_1 = 1$ but $i_2 \neq 2$. Then $2 < i_2$, and so $a_{2 i_2} = 0$; hence, $t = 0$. Thus, each term for which $i_1 \neq 1$ or $i_2 \neq 2$ is zero.

Similarly, we obtain that each term for which $i_1 \neq 1$ or $i_2 \neq 2$ or ... or $i_n \neq n$ is zero. Accordingly, $|A| = a_{11} a_{22} \cdots a_{nn}$ = product of diagonal elements.

8.25. Prove Theorem 8.3: $B$ is obtained from $A$ by an elementary operation.

(i) If two rows (columns) of $A$ were interchanged, then $|B| = -|A|$.

(ii) If a row (column) of $A$ were multiplied by a scalar $k$, then $|B| = k|A|$.

(iii) If a multiple of a row (column) of $A$ were added to another row (column) of $A$, then $|B| = |A|$.

(i) This result was proved in Problem 8.23.

(ii) If the $j$th row of $A$ is multiplied by $k$, then every term in $|A|$ is multiplied by $k$, and so $|B| = k|A|$. That is,

\[ |B| = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots (k a_{j i_j}) \cdots a_{n i_n} = k \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots a_{n i_n} = k|A| \]

(iii) Suppose $c$ times the $k$th row is added to the $j$th row of $A$. Using the symbol $\widehat{\ }$ to denote the $j$th position in a determinant term, we have

\[ |B| = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots \widehat{(c a_{k i_j} + a_{j i_j})} \cdots a_{n i_n} = c \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots \widehat{a_{k i_j}} \cdots a_{n i_n} + \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots a_{j i_j} \cdots a_{n i_n} \]

The first sum is the determinant of a matrix whose $k$th and $j$th rows are identical. Accordingly, by Theorem 8.2(ii), the sum is zero. The second sum is the determinant of $A$. Thus, $|B| = c \cdot 0 + |A| = |A|$.

8.26. Prove Lemma 8.6: Let $E$ be an elementary matrix. Then $|EA| = |E||A|$.

Consider the elementary row operations: (i) Multiply a row by a constant $k \neq 0$, (ii) Interchange two rows, (iii) Add a multiple of one row to another.

Let $E_1, E_2, E_3$ be the corresponding elementary matrices. That is, $E_1, E_2, E_3$ are obtained by applying the above operations to the identity matrix $I$. By Problem 8.25,

\[ |E_1| = k|I| = k, \qquad |E_2| = -|I| = -1, \qquad |E_3| = |I| = 1 \]

Recall (Theorem 3.11) that $E_i A$ is identical to the matrix obtained by applying the corresponding operation to $A$. Thus, by Theorem 8.3, we obtain the following, which proves our lemma:

\[ |E_1 A| = k|A| = |E_1||A|, \qquad |E_2 A| = -|A| = |E_2||A|, \qquad |E_3 A| = |A| = 1|A| = |E_3||A| \]

8.27. Suppose $B$ is row equivalent to a square matrix $A$. Prove that $|B| = 0$ if and only if $|A| = 0$.

By Theorem 8.3, the effect of an elementary row operation is to change the sign of the determinant or to multiply the determinant by a nonzero scalar. Hence, $|B| = 0$ if and only if $|A| = 0$.

8.28. Prove Theorem 8.5: Let $A$ be an $n$-square matrix. Then the following are equivalent:

(i) $A$ is invertible, (ii) $AX = 0$ has only the zero solution, (iii) $\det(A) \neq 0$.

The proof is by the Gaussian algorithm. If $A$ is invertible, it is row equivalent to $I$. But $|I| \neq 0$. Hence, by Problem 8.27, $|A| \neq 0$. If $A$ is not invertible, it is row equivalent to a matrix with a zero row. Hence, $\det(A) = 0$. Thus, (i) and (iii) are equivalent.

If $AX = 0$ has only the solution $X = 0$, then $A$ is row equivalent to $I$ and $A$ is invertible. Conversely, if $A$ is invertible with inverse $A^{-1}$, then

\[ X = IX = (A^{-1}A)X = A^{-1}(AX) = A^{-1}0 = 0 \]

is the only solution of $AX = 0$. Thus, (i) and (ii) are equivalent.

8.29. Prove Theorem 8.4: $|AB| = |A||B|$.

If $A$ is singular, then $AB$ is also singular, and so $|AB| = 0 = |A||B|$. On the other hand, if $A$ is nonsingular, then $A = E_n \cdots E_2 E_1$, a product of elementary matrices. Then, Lemma 8.6 and induction yield

\[ |AB| = |E_n \cdots E_2 E_1 B| = |E_n| \cdots |E_2||E_1||B| = |A||B| \]

8.30. Suppose $P$ is invertible. Prove that $|P^{-1}| = |P|^{-1}$.

$P^{-1}P = I$. Hence, $1 = |I| = |P^{-1}P| = |P^{-1}||P|$, and so $|P^{-1}| = |P|^{-1}$.

8.31. Prove Theorem 8.7: Suppose $A$ and $B$ are similar matrices. Then $|A| = |B|$.

Because $A$ and $B$ are similar, there exists an invertible matrix $P$ such that $B = P^{-1}AP$. Therefore, using Problem 8.30, we get $|B| = |P^{-1}AP| = |P^{-1}||A||P| = |A||P^{-1}||P| = |A|$.

We remark that although the matrices $P^{-1}$ and $A$ may not commute, their determinants $|P^{-1}|$ and $|A|$ do commute, because they are scalars in the field $K$.

8.32. Prove Theorem 8.8 (Laplace): Let $A = [a_{ij}]$, and let $A_{ij}$ denote the cofactor of $a_{ij}$. Then, for any $i$ or $j$,

\[ |A| = a_{i1}A_{i1} + \cdots + a_{in}A_{in} \quad\text{and}\quad |A| = a_{1j}A_{1j} + \cdots + a_{nj}A_{nj} \]

Because $|A| = |A^T|$, we need only prove one of the expansions, say, the first one in terms of rows of $A$. Each term in $|A|$ contains one and only one entry of the $i$th row $(a_{i1}, a_{i2}, \ldots, a_{in})$ of $A$. Hence, we can write $|A|$ in the form

\[ |A| = a_{i1}A^*_{i1} + a_{i2}A^*_{i2} + \cdots + a_{in}A^*_{in} \]

(Note that $A^*_{ij}$ is a sum of terms involving no entry of the $i$th row of $A$.) Thus, the theorem is proved if we can show that

\[ A^*_{ij} = A_{ij} = (-1)^{i+j}|M_{ij}| \]

where $M_{ij}$ is the matrix obtained by deleting the row and column containing the entry $a_{ij}$. (Historically, the expression $A^*_{ij}$ was defined as the cofactor of $a_{ij}$, and so the theorem reduces to showing that the two definitions of the cofactor are equivalent.)

First we consider the case that $i = n$, $j = n$. Then the sum of terms in $|A|$ containing $a_{nn}$ is

\[ a_{nn}A^*_{nn} = a_{nn} \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n-1,\sigma(n-1)} \]

where we sum over all permutations $\sigma \in S_n$ for which $\sigma(n) = n$. However, this is equivalent (Prove!) to summing over all permutations of $\{1, \ldots, n-1\}$. Thus, $A^*_{nn} = |M_{nn}| = (-1)^{n+n}|M_{nn}|$.

Now we consider any $i$ and $j$. We interchange the $i$th row with each succeeding row until it is last, and we interchange the $j$th column with each succeeding column until it is last. Note that the determinant $|M_{ij}|$ is not affected, because the relative positions of the other rows and columns are not affected by these interchanges. However, the "sign" of $|A|$ and of $A^*_{ij}$ is changed $n - i$ and then $n - j$ times. Accordingly,

\[ A^*_{ij} = (-1)^{n-i+n-j}|M_{ij}| = (-1)^{i+j}|M_{ij}| \]

8.33. Let $A = [a_{ij}]$ and let $B$ be the matrix obtained from $A$ by replacing the $i$th row of $A$ by the row vector $(b_{i1}, \ldots, b_{in})$. Show that

\[ |B| = b_{i1}A_{i1} + b_{i2}A_{i2} + \cdots + b_{in}A_{in} \]

Furthermore, show that, for $j \neq i$,

\[ a_{j1}A_{i1} + a_{j2}A_{i2} + \cdots + a_{jn}A_{in} = 0 \quad\text{and}\quad a_{1j}A_{1i} + a_{2j}A_{2i} + \cdots + a_{nj}A_{ni} = 0 \]

Let $B = [b_{ij}]$. By Theorem 8.8,

\[ |B| = b_{i1}B_{i1} + b_{i2}B_{i2} + \cdots + b_{in}B_{in} \]

Because $B_{ij}$ does not depend on the $i$th row of $B$, we get $B_{ij} = A_{ij}$ for $j = 1, \ldots, n$. Hence,

\[ |B| = b_{i1}A_{i1} + b_{i2}A_{i2} + \cdots + b_{in}A_{in} \]

Now let $A'$ be obtained from $A$ by replacing the $i$th row of $A$ by the $j$th row of $A$. Because $A'$ has two identical rows, $|A'| = 0$. Thus, by the above result,

\[ |A'| = a_{j1}A_{i1} + a_{j2}A_{i2} + \cdots + a_{jn}A_{in} = 0 \]

Using $|A^T| = |A|$, we also obtain that $a_{1j}A_{1i} + a_{2j}A_{2i} + \cdots + a_{nj}A_{ni} = 0$.

8.34. Prove Theorem 8.9: $A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A|I$.

Let $A = [a_{ij}]$ and let $A(\operatorname{adj} A) = [b_{ij}]$. The $i$th row of $A$ is

\[ (a_{i1}, a_{i2}, \ldots, a_{in}) \qquad (1) \]

Because $\operatorname{adj} A$ is the transpose of the matrix of cofactors, the $j$th column of $\operatorname{adj} A$ is the transpose of the cofactors of the $j$th row of $A$:

\[ (A_{j1}, A_{j2}, \ldots, A_{jn})^T \qquad (2) \]

Now $b_{ij}$, the $ij$ entry in $A(\operatorname{adj} A)$, is obtained by multiplying expressions (1) and (2):

\[ b_{ij} = a_{i1}A_{j1} + a_{i2}A_{j2} + \cdots + a_{in}A_{jn} \]

By Theorem 8.8 and Problem 8.33,

\[ b_{ij} = \begin{cases} |A| & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]

Accordingly, $A(\operatorname{adj} A)$ is the diagonal matrix with each diagonal element $|A|$. In other words, $A(\operatorname{adj} A) = |A|I$. Similarly, $(\operatorname{adj} A)A = |A|I$.
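The identity $A(\operatorname{adj} A) = |A|I$ can be illustrated numerically. In the sketch below the helper names (`minor`, `det`, `adjugate`, `matmul`) are hypothetical, and the sample matrix is an arbitrary 3 x 3 choice used only for illustration.

```python
def minor(m, i, j):
    """Submatrix of m with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if not m:
        return 1          # determinant of the empty (0 x 0) matrix
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """Classical adjoint: transpose of the matrix of cofactors."""
    n = len(m)
    # Entry (i, j) of adj m is the cofactor A_ji of m.
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 1, 0], [1, 1, 1], [0, 2, 1]]   # sample matrix for illustration
adj_A = adjugate(A)
d = det(A)
left = matmul(A, adj_A)
right = matmul(adj_A, A)
print(d, adj_A, left)
```

Both products come out as the scalar matrix with $|A|$ on the diagonal, which is also why $A^{-1} = (\operatorname{adj} A)/|A|$ whenever $|A| \neq 0$.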


8.35. Prove Theorem 8.10 (Cramer's rule): The (square) system $AX = B$ has a unique solution if and only if $D \neq 0$. In this case, $x_i = N_i/D$ for each $i$.

By previous results, $AX = B$ has a unique solution if and only if $A$ is invertible, and $A$ is invertible if and only if $D = |A| \neq 0$.

Now suppose $D \neq 0$. By Theorem 8.9, $A^{-1} = (1/D)(\operatorname{adj} A)$. Multiplying $AX = B$ by $A^{-1}$, we obtain

\[ X = A^{-1}AX = A^{-1}B = (1/D)(\operatorname{adj} A)B \qquad (1) \]

Note that the $i$th row of $(1/D)(\operatorname{adj} A)$ is $(1/D)(A_{1i}, A_{2i}, \ldots, A_{ni})$. If $B = (b_1, b_2, \ldots, b_n)^T$, then, by (1),

\[ x_i = (1/D)(b_1 A_{1i} + b_2 A_{2i} + \cdots + b_n A_{ni}) \]

However, as in Problem 8.33, $b_1 A_{1i} + b_2 A_{2i} + \cdots + b_n A_{ni} = N_i$, the determinant of the matrix obtained by replacing the $i$th column of $A$ by the column vector $B$. Thus, $x_i = (1/D)N_i$, as required.


8.36. Prove Theorem 8.12: Suppose $M$ is an upper (lower) triangular block matrix with diagonal blocks $A_1, A_2, \ldots, A_n$. Then

\[ \det(M) = \det(A_1)\det(A_2) \cdots \det(A_n) \]

We need only prove the theorem for $n = 2$; that is, when $M$ is a square matrix of the form $M = \begin{bmatrix} A & C \\ 0 & B \end{bmatrix}$. The proof of the general theorem follows easily by induction.

Suppose $A = [a_{ij}]$ is $r$-square, $B = [b_{ij}]$ is $s$-square, and $M = [m_{ij}]$ is $n$-square, where $n = r + s$. By definition,

\[ \det(M) = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)} \]

If $i > r$ and $j \le r$, then $m_{ij} = 0$. Thus, we need only consider those permutations $\sigma$ such that

\[ \sigma\{r+1, r+2, \ldots, r+s\} = \{r+1, r+2, \ldots, r+s\} \quad\text{and}\quad \sigma\{1, 2, \ldots, r\} = \{1, 2, \ldots, r\} \]

Let $\sigma_1(k) = \sigma(k)$ for $k \le r$, and let $\sigma_2(k) = \sigma(r + k) - r$ for $k \le s$. Then

\[ (\operatorname{sgn} \sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)} = (\operatorname{sgn} \sigma_1)\, a_{1\sigma_1(1)} a_{2\sigma_1(2)} \cdots a_{r\sigma_1(r)}\, (\operatorname{sgn} \sigma_2)\, b_{1\sigma_2(1)} b_{2\sigma_2(2)} \cdots b_{s\sigma_2(s)} \]

which implies $\det(M) = \det(A)\det(B)$.

8.37. Prove Theorem 8.15: There exists a unique function $D: M \to K$ such that

(i) $D$ is multilinear, (ii) $D$ is alternating, (iii) $D(I) = 1$.

This function $D$ is the determinant function; that is, $D(A) = |A|$.

Let $D$ be the determinant function, $D(A) = |A|$. We must show that $D$ satisfies (i), (ii), and (iii), and that $D$ is the only function satisfying (i), (ii), and (iii).

By Theorem 8.2, $D$ satisfies (ii) and (iii). Hence, we show that it is multilinear. Suppose the $i$th row of $A = [a_{ij}]$ has the form $(b_{i1} + c_{i1}, b_{i2} + c_{i2}, \ldots, b_{in} + c_{in})$. Then

\[ \begin{aligned} D(A) &= D(A_1, \ldots, B_i + C_i, \ldots, A_n) \\ &= \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots a_{i-1,\sigma(i-1)} (b_{i\sigma(i)} + c_{i\sigma(i)}) \cdots a_{n\sigma(n)} \\ &= \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots b_{i\sigma(i)} \cdots a_{n\sigma(n)} + \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots c_{i\sigma(i)} \cdots a_{n\sigma(n)} \\ &= D(A_1, \ldots, B_i, \ldots, A_n) + D(A_1, \ldots, C_i, \ldots, A_n) \end{aligned} \]

Also, by Theorem 8.3(ii),

\[ D(A_1, \ldots, kA_i, \ldots, A_n) = kD(A_1, \ldots, A_i, \ldots, A_n) \]

Thus, $D$ is multilinear; that is, $D$ satisfies (i).

We next must prove the uniqueness of $D$. Suppose $D$ satisfies (i), (ii), and (iii). If $\{e_1, \ldots, e_n\}$ is the usual basis of $K^n$, then, by (iii), $D(e_1, e_2, \ldots, e_n) = D(I) = 1$. Using (ii), we also have that

\[ D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \operatorname{sgn} \sigma, \quad\text{where } \sigma = i_1 i_2 \cdots i_n \qquad (1) \]

Now suppose $A = [a_{ij}]$. Observe that the $k$th row $A_k$ of $A$ is

\[ A_k = (a_{k1}, a_{k2}, \ldots, a_{kn}) = a_{k1}e_1 + a_{k2}e_2 + \cdots + a_{kn}e_n \]

Thus,

\[ D(A) = D(a_{11}e_1 + \cdots + a_{1n}e_n,\; a_{21}e_1 + \cdots + a_{2n}e_n,\; \ldots,\; a_{n1}e_1 + \cdots + a_{nn}e_n) \]

Using the multilinearity of $D$, we can write $D(A)$ as a sum of terms of the form

\[ \begin{aligned} D(A) &= \sum D(a_{1i_1}e_{i_1}, a_{2i_2}e_{i_2}, \ldots, a_{ni_n}e_{i_n}) \\ &= \sum (a_{1i_1} a_{2i_2} \cdots a_{ni_n})\, D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) \end{aligned} \qquad (2) \]

where the sum is summed over all sequences $i_1 i_2 \cdots i_n$, where $i_k \in \{1, \ldots, n\}$. If two of the indices are equal, say $i_j = i_k$ but $j \neq k$, then, by (ii),

\[ D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = 0 \]

Accordingly, the sum in (2) need only be summed over all permutations $\sigma = i_1 i_2 \cdots i_n$. Using (1), we finally have that

\[ \begin{aligned} D(A) &= \sum_{\sigma} (a_{1i_1} a_{2i_2} \cdots a_{ni_n})\, D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) \\ &= \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ni_n}, \quad\text{where } \sigma = i_1 i_2 \cdots i_n \end{aligned} \]

Hence, $D$ is the determinant function, and so the theorem is proved.


SUPPLEMENTARY PROBLEMS 


Computation of Determinants 
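8.38. Evaluate:

\[ \text{(a) } \begin{vmatrix} 2 & 6 \\ 4 & 1 \end{vmatrix}, \quad \text{(b) } \begin{vmatrix} 5 & 1 \\ 3 & -2 \end{vmatrix}, \quad \text{(c) } \begin{vmatrix} -2 & 8 \\ -5 & -3 \end{vmatrix}, \quad \text{(d) } \begin{vmatrix} 4 & 9 \\ 1 & -3 \end{vmatrix}, \quad \text{(e) } \begin{vmatrix} a + b & a \\ b & a + b \end{vmatrix} \]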


8.39. Find all t such that (a) 


-4 

2 




4 

t-2 


= 0 


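8.40. Compute the determinant of each of the following matrices:

\[ \text{(a) } \begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}, \quad \text{(b) } \begin{bmatrix} 3 & -2 & -4 \\ 2 & 5 & -1 \\ 0 & 6 & 1 \end{bmatrix}, \quad \text{(c) } \begin{bmatrix} -2 & -1 & 4 \\ 6 & -3 & -2 \\ 4 & 1 & 2 \end{bmatrix}, \quad \text{(d) } \begin{bmatrix} 7 & 6 & 5 \\ 1 & 2 & 1 \\ 3 & -2 & 1 \end{bmatrix} \]

8.41. Find the determinant of each of the following matrices: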


0 -2 


3 -1 

4 -3 


1 -2 


1 -2 
4 3 


2 -1 


8.42. Evaluate: 


2 

-1 

3 

-4 


2 

-1 

4 

-3 


1 

-2 

3 

-1 

2 

1 

-2 

1 

, (b) 

-1 

1 

0 

2 

, (c) 

1 

1 

-2 

0 

3 

3 

-5 

4 

3 

2 

3 

-1 

2 

0 

4 

-5 

5 

2 

-1 

4 


1 

-2 

2 

-3 


1 

4 

4 

-6 


8.43. Evaluate each of the following determinants: 


1 

2 

-1 

3 

1 


1 

3 

5 

7 

9 


1 

2 

3 

4 

5 

2 

-1 

1 

-2 

3 


2 

4 

2 

4 

2 


5 

4 

3 

2 

1 

3 

1 

0 

2 

-1 

, (b) 

0 

0 

1 

2 

3 

. (c) 

0 

0 

6 

5 

1 

5 

1 

2 

-3 

4 


0 

0 

5 

6 

2 


0 

0 

0 

7 

4 

-2 

3 

-1 

1 

-2 


0 

0 

2 

3 

1 


0 

0 

0 

2 

3 


Cofactors, Classical Adjoints, Inverses 

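8.44. Find $\det(A)$, $\operatorname{adj} A$, and $A^{-1}$, where

\[ \text{(a) } A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 1 \end{bmatrix}, \quad \text{(b) } A = \begin{bmatrix} 1 & 2 & 2 \\ 3 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \]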


8.45. Find the classical adjoint of each matrix in Problem 8.41. 

8.46. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. (a) Find $\operatorname{adj} A$, (b) Show that $\operatorname{adj}(\operatorname{adj} A) = A$, (c) When does $A = \operatorname{adj} A$?

8.47. Show that if $A$ is diagonal (triangular) then $\operatorname{adj} A$ is diagonal (triangular).

8.48. Suppose $A = [a_{ij}]$ is triangular. Show that

(a) $A$ is invertible if and only if each diagonal element $a_{ii} \neq 0$.

(b) The diagonal elements of $A^{-1}$ (if it exists) are $a_{ii}^{-1}$, the reciprocals of the diagonal elements of $A$.


Minors, Principal Minors 


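8.49. Let

\[ A = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & 0 & -2 & 3 \\ 3 & -1 & 2 & 5 \\ 4 & -3 & 0 & -1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 1 & 3 & -1 & 5 \\ 2 & -3 & 1 & 4 \\ 0 & -5 & 2 & 1 \\ 3 & 0 & 5 & -2 \end{bmatrix} \]

Find the minor and the signed minor corresponding to the following submatrices:

(a) $A(1,4;\ 3,4)$, (b) $B(1,4;\ 3,4)$, (c) $A(2,3;\ 2,4)$, (d) $B(2,3;\ 2,4)$.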

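8.50. For $k = 1, 2, 3$, find the sum $S_k$ of all principal minors of order $k$ for

\[ \text{(a) } A = \begin{bmatrix} 1 & 3 & 2 \\ 2 & -4 & 3 \\ 5 & -2 & 1 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 1 & 5 & -4 \\ 2 & 6 & 1 \\ 3 & -2 & 0 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 1 & -4 & 3 \\ 2 & 1 & 5 \\ 4 & -7 & 11 \end{bmatrix} \]

8.51. For $k = 1, 2, 3, 4$, find the sum $S_k$ of all principal minors of order $k$ for

\[ \text{(a) } A = \begin{bmatrix} 1 & 2 & 3 & -1 \\ 1 & -2 & 0 & 5 \\ 0 & 1 & -2 & 2 \\ 4 & 0 & -1 & -3 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 1 & 2 & 3 \\ 1 & 3 & 0 & 4 \\ 2 & 7 & 4 & 5 \end{bmatrix} \]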


Determinants and Linear Equations 

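8.52. Solve the following systems by determinants:

\[ \text{(a) } \begin{cases} 3x + 5y = 8 \\ 4x - 2y = 1 \end{cases}, \quad \text{(b) } \begin{cases} 2x - 3y = -1 \\ 4x + 7y = -1 \end{cases}, \quad \text{(c) } \begin{cases} ax - 2by = c \\ 3ax - 5by = 2c \end{cases} \ (ab \neq 0) \]

8.53. Solve the following systems by determinants:

\[ \text{(a) } \begin{cases} 2x - 5y + 2z = 2 \\ x + 2y - 4z = 5 \\ 3x - 4y - 6z = 1 \end{cases}, \quad \text{(b) } \begin{cases} 2z + 3 = y + 3x \\ x - 3z = 2y + 1 \\ 3y + z = 2 - 2x \end{cases} \]

8.54. Prove Theorem 8.11: The system $AX = 0$ has a nonzero solution if and only if $D = |A| = 0$.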


Permutations 

8.55. Find the parity of the permutations $\sigma = 32154$, $\tau = 13524$, $\pi = 42531$ in $S_5$.

8.56. For the permutations in Problem 8.55, find

(a) $\tau \circ \sigma$, (b) $\pi \circ \sigma$, (c) $\sigma^{-1}$, (d) $\tau^{-1}$.

8.57. Let $\tau \in S_n$. Show that $\tau \circ \sigma$ runs through $S_n$ as $\sigma$ runs through $S_n$; that is, $S_n = \{\tau \circ \sigma : \sigma \in S_n\}$.

8.58. Let $\sigma \in S_n$ have the property that $\sigma(n) = n$. Let $\sigma^* \in S_{n-1}$ be defined by $\sigma^*(x) = \sigma(x)$.

(a) Show that $\operatorname{sgn} \sigma^* = \operatorname{sgn} \sigma$.

(b) Show that as $\sigma$ runs through $S_n$, where $\sigma(n) = n$, $\sigma^*$ runs through $S_{n-1}$; that is, $S_{n-1} = \{\sigma^* : \sigma \in S_n,\ \sigma(n) = n\}$.

8.59. Consider a permutation $\sigma = j_1 j_2 \cdots j_n$. Let $\{e_i\}$ be the usual basis of $K^n$, and let $A$ be the matrix whose $i$th row is $e_{j_i}$ [i.e., $A = (e_{j_1}, e_{j_2}, \ldots, e_{j_n})$]. Show that $|A| = \operatorname{sgn} \sigma$.

Determinant of Linear Operators 

8.60. Find the determinant of each of the following linear transformations: 

(a) $T: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $T(x,y) = (2x - 9y,\ 3x - 5y)$,

(b) $T: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $T(x,y,z) = (3x - 2z,\ 5y + 7z,\ x + y + z)$,

(c) $T: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $T(x,y,z) = (2x + 7y - 4z,\ 4x - 6y + 2z)$.

8.61. Let $D: V \to V$ be the differential operator; that is, $D(f(t)) = df/dt$. Find $\det(D)$ if $V$ is the vector space of functions with the following bases: (a) $\{1, t, \ldots, t^5\}$, (b) $\{e^t, e^{2t}, e^{3t}\}$, (c) $\{\sin t, \cos t\}$.

8.62. Prove Theorem 8.13: Let $F$ and $G$ be linear operators on a vector space $V$. Then

(i) $\det(F \circ G) = \det(F)\det(G)$, (ii) $F$ is invertible if and only if $\det(F) \neq 0$.

8.63. Prove (a) $\det(\mathbf{1}_V) = 1$, where $\mathbf{1}_V$ is the identity operator, (b) $\det(T^{-1}) = \det(T)^{-1}$ when $T$ is invertible.

Miscellaneous Problems

8.64. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^3$ determined by the following vectors:

(a) $u_1 = (1,2,-3)$, $u_2 = (3,4,-1)$, $u_3 = (2,-1,5)$,

(b) $u_1 = (1,1,3)$, $u_2 = (1,-2,-4)$, $u_3 = (4,1,5)$.

8.65. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^4$ determined by the following vectors:

$u_1 = (1,-2,5,-1)$, $u_2 = (2,1,-2,1)$, $u_3 = (3,0,1,-2)$, $u_4 = (1,-1,4,-1)$

8.66. Let $V$ be the space of $2 \times 2$ matrices $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ over $\mathbf{R}$. Determine whether $D: V \to \mathbf{R}$ is 2-linear (with respect to the rows), where

(a) $D(M) = a + d$, (c) $D(M) = ac - bd$, (e) $D(M) = 0$

(b) $D(M) = ad$, (d) $D(M) = ab - cd$, (f) $D(M) = 1$

8.67. Let $A$ be an $n$-square matrix. Prove $|kA| = k^n|A|$.

8.68. Let $A, B, C, D$ be commuting $n$-square matrices. Consider the $2n$-square block matrix $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$. Prove that $|M| = |A||D| - |B||C|$. Show that the result may not be true if the matrices do not commute.

8.69. Suppose $A$ is orthogonal; that is, $A^T A = I$. Show that $\det(A) = \pm 1$.

8.70. Let $V$ be the space of $m$-square matrices viewed as $m$-tuples of row vectors. Suppose $D: V \to K$ is $m$-linear and alternating. Show that

(a) $D(\ldots, A, \ldots, B, \ldots) = -D(\ldots, B, \ldots, A, \ldots)$; sign changed when two rows are interchanged.

(b) If $A_1, A_2, \ldots, A_m$ are linearly dependent, then $D(A_1, A_2, \ldots, A_m) = 0$.

8.71. Let $V$ be the space of $m$-square matrices (as above), and suppose $D: V \to K$. Show that the following weaker statement is equivalent to $D$ being alternating:

\[ D(A_1, A_2, \ldots, A_m) = 0 \quad\text{whenever } A_i = A_{i+1} \text{ for some } i \]

Let $V$ be the space of $n$-square matrices over $K$. Suppose $B \in V$ is invertible and so $\det(B) \neq 0$. Define $D: V \to K$ by $D(A) = \det(AB)/\det(B)$, where $A \in V$. Hence,

\[ D(A_1, A_2, \ldots, A_n) = \det(A_1 B, A_2 B, \ldots, A_n B)/\det(B) \]

where $A_i$ is the $i$th row of $A$, and so $A_i B$ is the $i$th row of $AB$. Show that $D$ is multilinear and alternating, and that $D(I) = 1$. (This method is used by some texts to prove that $|AB| = |A||B|$.)

8.72. Show that $g = g(x_1, \ldots, x_n) = (-1)^n V_{n-1}(x)$, where $g = g(x_i)$ is the difference product in Problem 8.19, $x = x_n$, and $V_{n-1}$ is the Vandermonde determinant defined by

\[ V_{n-1}(x) = \begin{vmatrix} 1 & 1 & \cdots & 1 & 1 \\ x_1 & x_2 & \cdots & x_{n-1} & x \\ x_1^2 & x_2^2 & \cdots & x_{n-1}^2 & x^2 \\ \vdots & \vdots & & \vdots & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_{n-1}^{n-1} & x^{n-1} \end{vmatrix} \]

8.73. Let $A$ be any matrix. Show that the signs of a minor $A[I, J]$ and its complementary minor $A[I', J']$ are equal.

8.74. Let $A$ be an $n$-square matrix. The determinantal rank of $A$ is the order of the largest square submatrix of $A$ (obtained by deleting rows and columns of $A$) whose determinant is not zero. Show that the determinantal rank of $A$ is equal to its rank, that is, the maximum number of linearly independent rows (or columns).


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix with rows $R_1, R_2, \ldots$.

8.38. (a) $-22$, (b) $-13$, (c) $46$, (d) $-21$, (e) $a^2 + ab + b^2$

8.39. (a) $3, 10$; (b) $5, -2$

8.40. (a) $21$, (b) $-11$, (c) $100$, (d) $0$

8.41. (a) $-131$, (b) $-55$

8.42. (a) $33$, (b) $0$, (c) $45$

8.43. (a) $-32$, (b) $-14$, (c) $-468$

8.44. (a) $|A| = -2$, $\operatorname{adj} A = [-1, -1, 1;\ -1, 1, -1;\ 2, -2, 0]$,
(b) $|A| = -1$, $\operatorname{adj} A = [1, 0, -2;\ -3, -1, 6;\ 2, 1, -5]$. Also, $A^{-1} = (\operatorname{adj} A)/|A|$

8.45. (a) $[-16, -29, -26, -2;\ -30, -38, -16, 29;\ -8, 51, -13, -1;\ -13, 1, 28, -18]$,
(b) $[21, -14, -17, -19;\ -44, 11, 33, 11;\ -29, 1, 13, 21;\ 17, 7, -19, -18]$

8.46. (a) $\operatorname{adj} A = [d, -b;\ -c, a]$, (c) $A = kI$

8.49. (a) $-3, -3$, (b) $-23, -23$, (c) $3, -3$, (d) $17, -17$

8.50. (a) $-2, -17, 73$, (b) $7, 10, 105$, (c) $13, 54, 0$

8.51. (a) $-6, 13, 62, -219$; (b) $7, -37, 30, 20$

8.52. (a) $x = \frac{21}{26}$, $y = \frac{29}{26}$; (b) $x = -\frac{5}{13}$, $y = \frac{1}{13}$; (c) $x = -\frac{c}{a}$, $y = -\frac{c}{b}$

8.53. (a) $x = 5$, $y = 2$, $z = 1$, (b) Because $D = 0$, the system cannot be solved by determinants.

8.55. $\operatorname{sgn} \sigma = 1$, $\operatorname{sgn} \tau = -1$, $\operatorname{sgn} \pi = -1$

8.56. (a) $\tau \circ \sigma = 53142$, (b) $\pi \circ \sigma = 52413$, (c) $\sigma^{-1} = 32154$, (d) $\tau^{-1} = 14253$

8.60. (a) $\det(T) = 17$, (b) $\det(T) = 4$, (c) not defined

8.61. (a) $0$, (b) $6$, (c) $1$

8.64. (a) $18$, (b) $0$

8.65. $17$

8.66. (a) no, (b) yes, (c) yes, (d) no, (e) yes, (f) no







Diagonalization: 
Eigenvalues and Eigenvectors 


9.1 Introduction 


The ideas in this chapter can be discussed from two points of view. 

Matrix Point of View 

Suppose an $n$-square matrix $A$ is given. The matrix $A$ is said to be diagonalizable if there exists a nonsingular matrix $P$ such that

\[ B = P^{-1}AP \]

is diagonal. This chapter discusses the diagonalization of a matrix $A$. In particular, an algorithm is given to find the matrix $P$ when it exists.

Linear Operator Point of View 

Suppose a linear operator $T: V \to V$ is given. The linear operator $T$ is said to be diagonalizable if there exists a basis $S$ of $V$ such that the matrix representation of $T$ relative to the basis $S$ is a diagonal matrix $D$. This chapter discusses conditions under which the linear operator $T$ is diagonalizable.

Equivalence of the Two Points of View 

The above two concepts are essentially the same. Specifically, a square matrix $A$ may be viewed as a linear operator $F$ defined by

\[ F(X) = AX \]

where $X$ is a column vector, and $B = P^{-1}AP$ represents $F$ relative to a new coordinate system (basis) $S$ whose elements are the columns of $P$. On the other hand, any linear operator $T$ can be represented by a matrix $A$ relative to one basis and, when a second basis is chosen, $T$ is represented by the matrix

\[ B = P^{-1}AP \]

where $P$ is the change-of-basis matrix.

Most theorems will be stated in two ways: once in terms of matrices $A$ and again in terms of linear mappings $T$.

Role of Underlying Field K 

The underlying number field $K$ did not play any special role in our previous discussions on vector spaces and linear mappings. However, the diagonalization of a matrix $A$ or a linear operator $T$ will depend on the roots of a polynomial $\Delta(t)$ over $K$, and these roots do depend on $K$. For example, suppose $\Delta(t) = t^2 + 1$. Then $\Delta(t)$ has no roots if $K = \mathbf{R}$, the real field; but $\Delta(t)$ has roots $\pm i$ if $K = \mathbf{C}$, the complex field. Furthermore, finding the roots of a polynomial with degree greater than two is a subject unto itself (frequently discussed in numerical analysis courses). Accordingly, our examples will usually lead to those polynomials $\Delta(t)$ whose roots can be easily determined.


9.2 Polynomials of Matrices 


Consider a polynomial $f(t) = a_n t^n + \cdots + a_1 t + a_0$ over a field $K$. Recall (Section 2.8) that if $A$ is any square matrix, then we define

\[ f(A) = a_n A^n + \cdots + a_1 A + a_0 I \]

where $I$ is the identity matrix. In particular, we say that $A$ is a root of $f(t)$ if $f(A) = 0$, the zero matrix.


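EXAMPLE 9.1 Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$. Then $A^2 = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}$. Let

\[ f(t) = 2t^2 - 3t + 5 \quad\text{and}\quad g(t) = t^2 - 5t - 2 \]

Then

\[ f(A) = 2A^2 - 3A + 5I = \begin{bmatrix} 14 & 20 \\ 30 & 44 \end{bmatrix} + \begin{bmatrix} -3 & -6 \\ -9 & -12 \end{bmatrix} + \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} 16 & 14 \\ 21 & 37 \end{bmatrix} \]

and

\[ g(A) = A^2 - 5A - 2I = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix} + \begin{bmatrix} -5 & -10 \\ -15 & -20 \end{bmatrix} + \begin{bmatrix} -2 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]

Thus, $A$ is a zero of $g(t)$.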
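Evaluating a polynomial at a matrix is easy to sketch with Horner's scheme. The function names `matmul` and `poly_at_matrix` below are illustrative, and coefficients are listed from the highest power down (a convention chosen here, not one fixed by the text).

```python
def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, a):
    """Evaluate f(A) = c_n A^n + ... + c_1 A + c_0 I by Horner's scheme.

    `coeffs` lists c_n, ..., c_1, c_0 (highest degree first).
    """
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        # result <- result * A + c * I
        result = matmul(result, a)
        for i in range(n):
            result[i][i] += c
    return result


A = [[1, 2], [3, 4]]
f_A = poly_at_matrix([2, -3, 5], A)    # f(t) = 2t^2 - 3t + 5
g_A = poly_at_matrix([1, -5, -2], A)   # g(t) = t^2 - 5t - 2
print(f_A, g_A)
```

Horner's scheme uses one matrix product per coefficient, so no explicit powers of $A$ need to be stored.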


The following theorem (proved in Problem 9.7) applies. 

THEOREM 9.1: Let $f$ and $g$ be polynomials. For any square matrix $A$ and scalar $k$,

(i) $(f + g)(A) = f(A) + g(A)$     (iii) $(kf)(A) = kf(A)$

(ii) $(fg)(A) = f(A)g(A)$     (iv) $f(A)g(A) = g(A)f(A)$.

Observe that (iv) tells us that any two polynomials in A commute. 


Matrices and Linear Operators 

Now suppose that $T: V \to V$ is a linear operator on a vector space V. Powers of T are defined by the
composition operation:

$$T^2 = T \circ T, \quad T^3 = T^2 \circ T, \quad \ldots$$

Also, for any polynomial $f(t) = a_n t^n + \cdots + a_1 t + a_0$, we define f(T) in the same way as we did for
matrices:

$$f(T) = a_n T^n + \cdots + a_1 T + a_0 I$$

where I is now the identity mapping. We also say that T is a zero or root of f(t) if f(T) = 0, the zero
mapping. We note that the relations in Theorem 9.1 hold for linear operators as they do for matrices.

Remark: Suppose A is a matrix representation of a linear operator T. Then f(A) is the matrix
representation of f(T), and, in particular, f(T) = 0 if and only if f(A) = 0.


9.3 Characteristic Polynomial, Cayley-Hamilton Theorem


Let $A = [a_{ij}]$ be an n-square matrix. The matrix $M = A - tI_n$, where $I_n$ is the n-square identity matrix and t
is an indeterminate, may be obtained by subtracting t down the diagonal of A. The negative of M is the
matrix $tI_n - A$, and its determinant

$$\Delta(t) = \det(tI_n - A) = (-1)^n \det(A - tI_n)$$

which is a polynomial in t of degree n and is called the characteristic polynomial of A.

We state an important theorem in linear algebra (proved in Problem 9.8). 

THEOREM 9.2: (Cayley-Hamilton) Every matrix A is a root of its characteristic polynomial. 

Remark: Suppose $A = [a_{ij}]$ is a triangular matrix. Then tI − A is a triangular matrix with diagonal
entries $t - a_{ii}$; hence,

$$\Delta(t) = \det(tI - A) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn})$$

Observe that the roots of $\Delta(t)$ are the diagonal elements of A.


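EXAMPLE 9.2 Let $A = \begin{bmatrix} 1 & 3 \\ 4 & 5 \end{bmatrix}$. Its characteristic polynomial is

$$\Delta(t) = |tI - A| = \begin{vmatrix} t - 1 & -3 \\ -4 & t - 5 \end{vmatrix} = (t - 1)(t - 5) - 12 = t^2 - 6t - 7$$

As expected from the Cayley-Hamilton theorem, A is a root of $\Delta(t)$; that is,

$$\Delta(A) = A^2 - 6A - 7I = \begin{bmatrix} 13 & 18 \\ 24 & 37 \end{bmatrix} + \begin{bmatrix} -6 & -18 \\ -24 & -30 \end{bmatrix} + \begin{bmatrix} -7 & 0 \\ 0 & -7 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$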

Now suppose A and B are similar matrices, say $B = P^{-1}AP$, where P is invertible. We show that A and
B have the same characteristic polynomial. Using $tI = P^{-1}tIP$, we have

$$\Delta_B(t) = \det(tI - B) = \det(tI - P^{-1}AP) = \det(P^{-1}tIP - P^{-1}AP)$$

$$= \det[P^{-1}(tI - A)P] = \det(P^{-1}) \det(tI - A) \det(P)$$

Using the fact that determinants are scalars and commute and that $\det(P^{-1})\det(P) = 1$, we finally obtain

$$\Delta_B(t) = \det(tI - A) = \Delta_A(t)$$

Thus, we have proved the following theorem. 


THEOREM 9.3: Similar matrices have the same characteristic polynomial. 


Characteristic Polynomials of Degrees 2 and 3 

There are simple formulas for the characteristic polynomials of matrices of orders 2 and 3. 

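(a) Suppose $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$. Then

$$\Delta(t) = t^2 - (a_{11} + a_{22})t + \det(A) = t^2 - \mathrm{tr}(A)\,t + \det(A)$$

Here tr(A) denotes the trace of A, that is, the sum of the diagonal elements of A.

(b) Suppose $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$. Then

$$\Delta(t) = t^3 - \mathrm{tr}(A)\,t^2 + (A_{11} + A_{22} + A_{33})t - \det(A)$$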


(Here $A_{11}$, $A_{22}$, $A_{33}$ denote, respectively, the cofactors of $a_{11}$, $a_{22}$, $a_{33}$.)

EXAMPLE 9.3 Find the characteristic polynomial of each of the following matrices:

(a) $A = \begin{bmatrix} 5 & 3 \\ 2 & 10 \end{bmatrix}$, (b) $B = \begin{bmatrix} 7 & -1 \\ 6 & 2 \end{bmatrix}$, (c) $C = \begin{bmatrix} 5 & -2 \\ 4 & -4 \end{bmatrix}$

(a) We have tr(A) = 5 + 10 = 15 and |A| = 50 − 6 = 44; hence, $\Delta(t) = t^2 - 15t + 44$.

(b) We have tr(B) = 7 + 2 = 9 and |B| = 14 + 6 = 20; hence, $\Delta(t) = t^2 - 9t + 20$.

(c) We have tr(C) = 5 − 4 = 1 and |C| = −20 + 8 = −12; hence, $\Delta(t) = t^2 - t - 12$.
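
The order-2 formula is short enough to state as code. The sketch below is an illustration only; the function name `char_poly_2x2` and the convention of returning coefficients highest degree first are assumptions made here, not notation from the text.

```python
def char_poly_2x2(M):
    """Coefficients (1, -tr(M), det(M)) of t^2 - tr(M) t + det(M)."""
    (a, b), (c, d) = M
    trace = a + d
    determinant = a * d - b * c
    return (1, -trace, determinant)


print(char_poly_2x2([[5, 3], [2, 10]]))   # (1, -15, 44)
print(char_poly_2x2([[7, -1], [6, 2]]))   # (1, -9, 20)
print(char_poly_2x2([[5, -2], [4, -4]]))  # (1, -1, -12)
```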


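EXAMPLE 9.4 Find the characteristic polynomial of $A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 3 & 2 \\ 1 & 3 & 9 \end{bmatrix}$.

We have tr(A) = 1 + 3 + 9 = 13. The cofactors of the diagonal elements are as follows:

$$A_{11} = \begin{vmatrix} 3 & 2 \\ 3 & 9 \end{vmatrix} = 21, \quad A_{22} = \begin{vmatrix} 1 & 2 \\ 1 & 9 \end{vmatrix} = 7, \quad A_{33} = \begin{vmatrix} 1 & 1 \\ 0 & 3 \end{vmatrix} = 3$$

Thus, $A_{11} + A_{22} + A_{33} = 31$. Also, |A| = 27 + 2 + 0 − 6 − 6 − 0 = 17. Accordingly,

$$\Delta(t) = t^3 - 13t^2 + 31t - 17$$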

Remark: The coefficients of the characteristic polynomial $\Delta(t)$ of the 3-square matrix A are, with
alternating signs, as follows:

$$S_1 = \mathrm{tr}(A), \quad S_2 = A_{11} + A_{22} + A_{33}, \quad S_3 = \det(A)$$

We note that each $S_k$ is the sum of all principal minors of A of order k.

The next theorem, whose proof lies beyond the scope of this text, tells us that this result is true in
general.

THEOREM 9.4: Let A be an n-square matrix. Then its characteristic polynomial is
$$\Delta(t) = t^n - S_1 t^{n-1} + S_2 t^{n-2} + \cdots + (-1)^n S_n$$
where $S_k$ is the sum of the principal minors of order k.
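
Theorem 9.4 translates directly into a (slow but transparent) procedure: for each k, sum the determinants of all k × k submatrices taken on the same rows and columns. The following sketch assumes integer matrices stored as lists of rows; `det` uses cofactor expansion along the first row, and both function names are illustrative rather than taken from the text. It is meant for small matrices only, because the number of principal minors grows quickly with n.

```python
from itertools import combinations


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def char_poly(A):
    """Coefficients [1, -S_1, S_2, ..., (-1)^n S_n], highest degree first."""
    n = len(A)
    coeffs = [1]
    for k in range(1, n + 1):
        # S_k = sum of all principal minors of order k
        S_k = sum(det([[A[i][j] for j in idx] for i in idx])
                  for idx in combinations(range(n), k))
        coeffs.append((-1) ** k * S_k)
    return coeffs


print(char_poly([[1, 1, 2], [0, 3, 2], [1, 3, 9]]))  # [1, -13, 31, -17]
```

The printed coefficients match $\Delta(t) = t^3 - 13t^2 + 31t - 17$ from Example 9.4.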


Characteristic Polynomial of a Linear Operator 

Now suppose $T: V \to V$ is a linear operator on a vector space V of finite dimension. We define the
characteristic polynomial $\Delta(t)$ of T to be the characteristic polynomial of any matrix representation of T.
Recall that if A and B are matrix representations of T, then $B = P^{-1}AP$, where P is a change-of-basis
matrix. Thus, A and B are similar, and by Theorem 9.3, A and B have the same characteristic polynomial.
Accordingly, the characteristic polynomial of T is independent of the particular basis in which the matrix
representation of T is computed.

Because f(T) = 0 if and only if f(A) = 0, where f(t) is any polynomial and A is any matrix
representation of T, we have the following analogous theorem for linear operators.

THEOREM 9.2′: (Cayley-Hamilton) A linear operator T is a zero of its characteristic polynomial.


9.4 Diagonalization, Eigenvalues and Eigenvectors


Let A be any n-square matrix. Then A can be represented by (or is similar to) a diagonal matrix
$D = \mathrm{diag}(k_1, k_2, \ldots, k_n)$ if and only if there exists a basis S consisting of (column) vectors $u_1, u_2, \ldots, u_n$
such that

$$Au_1 = k_1 u_1, \quad Au_2 = k_2 u_2, \quad \ldots, \quad Au_n = k_n u_n$$

In such a case, A is said to be diagonalizable. Furthermore, $D = P^{-1}AP$, where P is the nonsingular matrix
whose columns are, respectively, the basis vectors $u_1, u_2, \ldots, u_n$.

The above observation leads us to the following definition. 

DEFINITION: Let A be any square matrix. A scalar $\lambda$ is called an eigenvalue of A if there exists a
nonzero (column) vector v such that

$$Av = \lambda v$$

Any vector satisfying this relation is called an eigenvector of A belonging to the
eigenvalue $\lambda$.

We note that each scalar multiple kv of an eigenvector v belonging to $\lambda$ is also such an eigenvector,
because

$$A(kv) = k(Av) = k(\lambda v) = \lambda(kv)$$

The set $E_\lambda$ of all such eigenvectors is a subspace of V (Problem 9.19), called the eigenspace of $\lambda$. (If
$\dim E_\lambda = 1$, then $E_\lambda$ is called an eigenline and $\lambda$ is called a scaling factor.)

The terms characteristic value and characteristic vector (or proper value and proper vector) are 
sometimes used instead of eigenvalue and eigenvector. 

The above observation and definitions give us the following theorem. 

THEOREM 9.5: An n-square matrix A is similar to a diagonal matrix D if and only if A has n linearly
independent eigenvectors. In this case, the diagonal elements of D are the corresponding
eigenvalues and $D = P^{-1}AP$, where P is the matrix whose columns are the eigenvectors.

Suppose a matrix A can be diagonalized as above, say $P^{-1}AP = D$, where D is diagonal. Then A has the
extremely useful diagonal factorization:

$$A = PDP^{-1}$$

Using this factorization, the algebra of A reduces to the algebra of the diagonal matrix D, which can be
easily calculated. Specifically, suppose $D = \mathrm{diag}(k_1, k_2, \ldots, k_n)$. Then

$$A^m = (PDP^{-1})^m = PD^mP^{-1} = P\,\mathrm{diag}(k_1^m, \ldots, k_n^m)\,P^{-1}$$

More generally, for any polynomial f(t),

$$f(A) = f(PDP^{-1}) = Pf(D)P^{-1} = P\,\mathrm{diag}(f(k_1), f(k_2), \ldots, f(k_n))\,P^{-1}$$

Furthermore, if the diagonal entries of D are nonnegative, let

$$B = P\,\mathrm{diag}(\sqrt{k_1}, \sqrt{k_2}, \ldots, \sqrt{k_n})\,P^{-1}$$

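Then B is a nonnegative square root of A; that is, $B^2 = A$ and the eigenvalues of B are nonnegative.

EXAMPLE 9.5 Let $A = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix}$ and let $v_1 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$ and $v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. Then

$$Av_1 = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix} = v_1 \quad \text{and} \quad Av_2 = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = 4v_2$$

Thus, $v_1$ and $v_2$ are eigenvectors of A belonging, respectively, to the eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 4$. Observe that
$v_1$ and $v_2$ are linearly independent and hence form a basis of $\mathbf{R}^2$. Accordingly, A is diagonalizable. Furthermore, let P be
the matrix whose columns are the eigenvectors $v_1$ and $v_2$. That is, let

$$P = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix}, \quad \text{and so} \quad P^{-1} = \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} \end{bmatrix}$$

Then A is similar to the diagonal matrix

$$D = P^{-1}AP = \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix}$$

As expected, the diagonal elements 1 and 4 in D are the eigenvalues corresponding, respectively, to the eigenvectors
$v_1$ and $v_2$, which are the columns of P. In particular, A has the factorization

$$A = PDP^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} \end{bmatrix}$$

Accordingly,

$$A^4 = PD^4P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 256 \end{bmatrix} \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} \end{bmatrix} = \begin{bmatrix} 171 & 85 \\ 170 & 86 \end{bmatrix}$$

Moreover, suppose $f(t) = t^3 - 5t^2 + 3t + 6$; hence, f(1) = 5 and f(4) = 2. Then

$$f(A) = Pf(D)P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ -2 & 4 \end{bmatrix}$$

Last, we obtain a "positive square root" of A. Specifically, using $\sqrt{1} = 1$ and $\sqrt{4} = 2$, we obtain the matrix

$$B = P\sqrt{D}P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} \end{bmatrix} = \begin{bmatrix} \frac{5}{3} & \frac{1}{3} \\ \frac{2}{3} & \frac{4}{3} \end{bmatrix}$$

where $B^2 = A$ and where B has positive eigenvalues 1 and 2.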
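
The three computations in Example 9.5 follow one pattern: apply a scalar function to the diagonal of D and conjugate by P. The sketch below reproduces them with exact rational arithmetic from Python's standard `fractions` module. The helper `function_of_A` is a name introduced here for illustration, and the matrices P and $P^{-1}$ are hard-coded from the example rather than computed.

```python
from fractions import Fraction
from math import isqrt


def mat_mul(X, Y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


# P and P^{-1} from Example 9.5; eigenvalues in the order of P's columns.
P = [[Fraction(1), Fraction(1)], [Fraction(-2), Fraction(1)]]
P_inv = [[Fraction(1, 3), Fraction(-1, 3)], [Fraction(2, 3), Fraction(1, 3)]]
eigenvalues = [1, 4]


def function_of_A(func):
    """Return P diag(func(k_1), func(k_2)) P^{-1}."""
    D = [[Fraction(func(eigenvalues[0])), Fraction(0)],
         [Fraction(0), Fraction(func(eigenvalues[1]))]]
    return mat_mul(mat_mul(P, D), P_inv)


A = function_of_A(lambda k: k)                              # A itself
A_pow_4 = function_of_A(lambda k: k ** 4)                   # A^4
f_A = function_of_A(lambda k: k**3 - 5 * k**2 + 3 * k + 6)  # f(A)
B = function_of_A(isqrt)  # exact only because 1 and 4 are perfect squares

print(A_pow_4 == [[171, 85], [170, 86]])  # True
print(f_A == [[3, -1], [-2, 4]])          # True
print(mat_mul(B, B) == A)                 # True
```

Using `isqrt` is a shortcut that works here only because both eigenvalues are perfect squares; a general square root would need floating-point or symbolic arithmetic.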

Remark: Throughout this chapter, we use the following fact: 


If $P = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, then $P^{-1} = \begin{bmatrix} d/|P| & -b/|P| \\ -c/|P| & a/|P| \end{bmatrix}$

That is, $P^{-1}$ is obtained by interchanging the diagonal elements a and d of P, taking the negatives of the
nondiagonal elements b and c, and dividing each element by the determinant |P|.
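
As a quick check of this rule, here is a small sketch using exact fractions; the function name `inverse_2x2` is illustrative. It is applied to the matrix P of Example 9.5.

```python
from fractions import Fraction


def inverse_2x2(P):
    """Swap a and d, negate b and c, divide every entry by |P|."""
    (a, b), (c, d) = P
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


P_inv = inverse_2x2([[1, 1], [-2, 1]])
print(P_inv == [[Fraction(1, 3), Fraction(-1, 3)],
                [Fraction(2, 3), Fraction(1, 3)]])  # True
```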


Properties of Eigenvalues and Eigenvectors 

Example 9.5 indicates the advantages of a diagonal representation (factorization) of a square matrix.
In the following theorem (proved in Problem 9.20), we list properties that help us to find such a
representation.

THEOREM 9.6: Let A be a square matrix. Then the following are equivalent.

(i) A scalar $\lambda$ is an eigenvalue of A.

(ii) The matrix $M = A - \lambda I$ is singular.

(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of A.

The eigenspace $E_\lambda$ of an eigenvalue $\lambda$ is the solution space of the homogeneous system MX = 0, where
$M = A - \lambda I$; that is, M is obtained by subtracting $\lambda$ down the diagonal of A.

Some matrices have no eigenvalues and hence no eigenvectors. However, using Theorem 9.6 and the
Fundamental Theorem of Algebra (every polynomial over the complex field C has a root), we obtain the
following result.

THEOREM 9.7: Let A be a square matrix over the complex field C. Then A has at least one eigenvalue.

The following theorems will be used subsequently. (The theorem equivalent to Theorem 9.8 for linear
operators is proved in Problem 9.21, and Theorem 9.9 is proved in Problem 9.22.)

THEOREM 9.8: Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of a matrix A belonging to distinct
eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.

THEOREM 9.9: Suppose the characteristic polynomial $\Delta(t)$ of an n-square matrix A is a product of n
distinct factors, say, $\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$. Then A is similar to the
diagonal matrix $D = \mathrm{diag}(a_1, a_2, \ldots, a_n)$.

If $\lambda$ is an eigenvalue of a matrix A, then the algebraic multiplicity of $\lambda$ is defined to be the multiplicity of
$\lambda$ as a root of the characteristic polynomial of A, and the geometric multiplicity of $\lambda$ is defined to be the
dimension of its eigenspace, $\dim E_\lambda$. The following theorem (whose equivalent for linear operators is
proved in Problem 9.23) holds.

THEOREM 9.10: The geometric multiplicity of an eigenvalue $\lambda$ of a matrix A does not exceed its
algebraic multiplicity.

Diagonalization of Linear Operators 

Consider a linear operator $T: V \to V$. Then T is said to be diagonalizable if it can be represented by a
diagonal matrix D. Thus, T is diagonalizable if and only if there exists a basis $S = \{u_1, u_2, \ldots, u_n\}$ of V for
which

$$T(u_1) = k_1 u_1, \quad T(u_2) = k_2 u_2, \quad \ldots, \quad T(u_n) = k_n u_n$$

In such a case, T is represented by the diagonal matrix
$$D = \mathrm{diag}(k_1, k_2, \ldots, k_n)$$
relative to the basis S.

The above observation leads us to the following definitions and theorems, which are analogous to the 
definitions and theorems for matrices discussed above. 

DEFINITION: Let T be a linear operator. A scalar $\lambda$ is called an eigenvalue of T if there exists a
nonzero vector v such that $T(v) = \lambda v$.

Every vector satisfying this relation is called an eigenvector of T belonging to the
eigenvalue $\lambda$.

The set $E_\lambda$ of all eigenvectors belonging to an eigenvalue $\lambda$ is a subspace of V, called the
eigenspace of $\lambda$. (Alternatively, $\lambda$ is an eigenvalue of T if $\lambda I - T$ is singular, and, in this case, $E_\lambda$ is the
kernel of $\lambda I - T$.) The algebraic and geometric multiplicities of an eigenvalue $\lambda$ of a linear operator T are
defined in the same way as those of an eigenvalue of a matrix A.

The following theorems apply to a linear operator T on a vector space V of finite dimension. 

THEOREM 9.5′: T can be represented by a diagonal matrix D if and only if there exists a basis S of V
consisting of eigenvectors of T. In this case, the diagonal elements of D are the
corresponding eigenvalues.

THEOREM 9.6′: Let T be a linear operator. Then the following are equivalent:

(i) A scalar $\lambda$ is an eigenvalue of T.

(ii) The linear operator $\lambda I - T$ is singular.

(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of T.

THEOREM 9.7′: Suppose V is a complex vector space. Then T has at least one eigenvalue.

THEOREM 9.8′: Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of a linear operator T belonging to
distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.

THEOREM 9.9′: Suppose the characteristic polynomial $\Delta(t)$ of T is a product of n distinct factors, say,
$\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$. Then T can be represented by the diagonal matrix
$D = \mathrm{diag}(a_1, a_2, \ldots, a_n)$.

THEOREM 9.10′: The geometric multiplicity of an eigenvalue $\lambda$ of T does not exceed its algebraic
multiplicity.

Remark: The following theorem reduces the investigation of the diagonalization of a linear operator 
T to the diagonalization of a matrix A. 

THEOREM 9.11: Suppose A is a matrix representation of T. Then T is diagonalizable if and only if A is 
diagonalizable. 


9.5 Computing Eigenvalues and Eigenvectors, Diagonalizing Matrices 

This section gives an algorithm for computing eigenvalues and eigenvectors for a given square matrix A
and for determining whether or not a nonsingular matrix P exists such that $P^{-1}AP$ is diagonal.

ALGORITHM 9.1: (Diagonalization Algorithm) The input is an n-square matrix A.

Step 1. Find the characteristic polynomial $\Delta(t)$ of A.

Step 2. Find the roots of $\Delta(t)$ to obtain the eigenvalues of A.

Step 3. Repeat (a) and (b) for each eigenvalue $\lambda$ of A.

(a) Form the matrix $M = A - \lambda I$ by subtracting $\lambda$ down the diagonal of A.

(b) Find a basis for the solution space of the homogeneous system MX = 0. (These basis
vectors are linearly independent eigenvectors of A belonging to $\lambda$.)





CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 


Step 4. Consider the collection $S = \{v_1, v_2, \ldots, v_m\}$ of all eigenvectors obtained in Step 3.

(a) If $m \neq n$, then A is not diagonalizable.

(b) If m = n, then A is diagonalizable. Specifically, let P be the matrix whose columns are the
eigenvectors $v_1, v_2, \ldots, v_n$. Then

$$D = P^{-1}AP = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$$
where $\lambda_i$ is the eigenvalue corresponding to the eigenvector $v_i$.
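
Steps 3 and 4 of the algorithm can be sketched in code once the eigenvalues are known. The version below is an illustration under stated assumptions: the eigenvalues from Step 2 are supplied by the caller as exact rationals or integers, the solution space of MX = 0 is found by Gauss-Jordan elimination over `fractions.Fraction`, and all function names are hypothetical.

```python
from fractions import Fraction


def null_space_basis(M):
    """Basis of the solution space of M X = 0 (Gauss-Jordan elimination)."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # column c has no pivot, so x_c is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            v[p] = -rows[row_index][free]
        basis.append(v)
    return basis


def eigenvectors(A, eigenvalues):
    """Step 3: for each eigenvalue, a basis of the null space of A - lambda*I."""
    n = len(A)
    found = []
    for lam in eigenvalues:
        M = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
             for i in range(n)]
        found.extend((lam, v) for v in null_space_basis(M))
    return found


def is_diagonalizable(A, eigenvalues):
    """Step 4: A is diagonalizable exactly when n eigenvectors were found."""
    return len(eigenvectors(A, eigenvalues)) == len(A)


print(is_diagonalizable([[4, 2], [3, -1]], [5, -2]))  # True
print(is_diagonalizable([[5, -1], [1, 3]], [4]))      # False
```

The eigenvectors returned are scaled so that each free variable equals 1, so they may differ from a hand computation by a nonzero scalar factor.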


EXAMPLE 9.6 The diagonalization algorithm is applied to $A = \begin{bmatrix} 4 & 2 \\ 3 & -1 \end{bmatrix}$.

(1) The characteristic polynomial $\Delta(t)$ of A is computed:

tr(A) = 4 − 1 = 3, |A| = −4 − 6 = −10, so $\Delta(t) = t^2 - 3t - 10 = (t - 5)(t + 2)$

(2) Set $\Delta(t) = t^2 - 3t - 10 = (t - 5)(t + 2) = 0$. The roots $\lambda_1 = 5$ and $\lambda_2 = -2$ are the eigenvalues of A.

(3) (i) We find an eigenvector $v_1$ of A belonging to the eigenvalue $\lambda_1 = 5$. Subtract $\lambda_1 = 5$ down the diagonal
of A to obtain the matrix

$$M = \begin{bmatrix} -1 & 2 \\ 3 & -6 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} -x + 2y &= 0 \\ 3x - 6y &= 0 \end{aligned} \quad \text{or} \quad -x + 2y = 0$$

The system has only one independent solution, for example, $v_1 = (2, 1)$.

(ii) We find an eigenvector $v_2$ of A belonging to the eigenvalue $\lambda_2 = -2$. Subtract −2 (or add 2)
down the diagonal of A to obtain the matrix

$$M = \begin{bmatrix} 6 & 2 \\ 3 & 1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} 6x + 2y &= 0 \\ 3x + y &= 0 \end{aligned} \quad \text{or} \quad 3x + y = 0$$

The system has only one independent solution, for example, $v_2 = (-1, 3)$.

(4) Let P be the matrix whose columns are the eigenvectors $v_1$ and $v_2$. Then $P = \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix}$. Thus,
$D = P^{-1}AP$ is the diagonal matrix whose diagonal entries are the corresponding eigenvalues:

$$D = P^{-1}AP = \begin{bmatrix} 3/7 & 1/7 \\ -1/7 & 2/7 \end{bmatrix} \begin{bmatrix} 4 & 2 \\ 3 & -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 5 & 0 \\ 0 & -2 \end{bmatrix}$$

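Remark: Find a 2 × 2 matrix A with eigenvalues $\lambda_1 = 2$ and $\lambda_2 = 3$ and corresponding eigenvectors
$v_1 = (1, 3)$ and $v_2 = (1, 4)$. We know that $P^{-1}AP = D$, where

$$P = \begin{bmatrix} 1 & 1 \\ 3 & 4 \end{bmatrix} \quad \text{and} \quad D = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$$

Thus,

$$A = PDP^{-1} = \begin{bmatrix} 1 & 1 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 1 \\ -12 & 6 \end{bmatrix}$$

[Here $P^{-1} = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix}$.]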


EXAMPLE 9.7 Consider the matrix $B = \begin{bmatrix} 5 & -1 \\ 1 & 3 \end{bmatrix}$. We have

tr(B) = 5 + 3 = 8, |B| = 15 + 1 = 16; so $\Delta(t) = t^2 - 8t + 16 = (t - 4)^2$

Accordingly, $\lambda = 4$ is the only eigenvalue of B.

Subtract $\lambda = 4$ down the diagonal of B to obtain the matrix

$$M = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} x - y &= 0 \\ x - y &= 0 \end{aligned} \quad \text{or} \quad x - y = 0$$

The system has only one independent solution; for example, x = 1, y = 1. Thus, v = (1, 1) and its multiples are the
only eigenvectors of B. Accordingly, B is not diagonalizable, because there does not exist a basis consisting of
eigenvectors of B.


EXAMPLE 9.8 Consider the matrix $A = \begin{bmatrix} 3 & -5 \\ 2 & -3 \end{bmatrix}$. Here tr(A) = 3 − 3 = 0 and |A| = −9 + 10 = 1. Thus,
$\Delta(t) = t^2 + 1$ is the characteristic polynomial of A. We consider two cases:

(a) A is a matrix over the real field R. Then $\Delta(t)$ has no (real) roots. Thus, A has no eigenvalues and no eigenvectors,
and so A is not diagonalizable.

(b) A is a matrix over the complex field C. Then $\Delta(t) = (t - i)(t + i)$ has two roots, i and −i. Thus, A has two
distinct eigenvalues i and −i, and hence, A has two independent eigenvectors. Accordingly, there exists a
nonsingular matrix P over the complex field C for which

$$P^{-1}AP = \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix}$$

Therefore, A is diagonalizable (over C).
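
Case (b) can be made concrete with Python's built-in complex numbers. The eigenvectors used below, (5, 3 − i) for i and (5, 3 + i) for −i, are not given in the example; they were obtained here by solving (3 − λ)x − 5y = 0 and should be read as an illustrative assumption that the code then verifies.

```python
A = [[3, -5], [2, -3]]


def apply(matrix, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]


# (eigenvalue, candidate eigenvector) pairs over the complex field
candidates = [(1j, [5, 3 - 1j]), (-1j, [5, 3 + 1j])]

for lam, v in candidates:
    # A v should equal lambda v for a genuine eigenpair
    print(apply(A, v) == [lam * x for x in v])  # True, then True
```

The entries are small integers, so the floating-point complex arithmetic here is exact and the equality test is safe; for general matrices a tolerance would be needed.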


9.6 Diagonalizing Real Symmetric Matrices and Quadratic Forms 


There are many real matrices A that are not diagonalizable. In fact, some real matrices may not have any 
(real) eigenvalues. However, if A is a real symmetric matrix, then these problems do not exist. Namely, we 
have the following theorems. 

THEOREM 9.12: Let A be a real symmetric matrix. Then each root $\lambda$ of its characteristic polynomial is
real.

THEOREM 9.13: Let A be a real symmetric matrix. Suppose u and v are eigenvectors of A belonging to
distinct eigenvalues $\lambda_1$ and $\lambda_2$. Then u and v are orthogonal; that is, $\langle u, v \rangle = 0$.

The above two theorems give us the following fundamental result.

THEOREM 9.14: Let A be a real symmetric matrix. Then there exists an orthogonal matrix P such that
$D = P^{-1}AP$ is diagonal.


The orthogonal matrix P is obtained by normalizing a basis of orthogonal eigenvectors of A as 
illustrated below. In such a case, we say that A is “orthogonally diagonalizable.” 

EXAMPLE 9.9 Let $A = \begin{bmatrix} 2 & -2 \\ -2 & 5 \end{bmatrix}$, a real symmetric matrix. Find an orthogonal matrix P such that $P^{-1}AP$ is
diagonal.

First we find the characteristic polynomial $\Delta(t)$ of A. We have

tr(A) = 2 + 5 = 7, |A| = 10 − 4 = 6; so $\Delta(t) = t^2 - 7t + 6 = (t - 6)(t - 1)$

Accordingly, $\lambda_1 = 6$ and $\lambda_2 = 1$ are the eigenvalues of A.

(a) Subtracting $\lambda_1 = 6$ down the diagonal of A yields the matrix

$$M = \begin{bmatrix} -4 & -2 \\ -2 & -1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} -4x - 2y &= 0 \\ -2x - y &= 0 \end{aligned} \quad \text{or} \quad 2x + y = 0$$

A nonzero solution is $u_1 = (1, -2)$.





(b) Subtracting $\lambda_2 = 1$ down the diagonal of A yields the matrix

$$M = \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} \quad \text{and the homogeneous system} \quad x - 2y = 0$$

(The second equation drops out, because it is a multiple of the first equation.) A nonzero solution is
$u_2 = (2, 1)$.

As expected from Theorem 9.13, $u_1$ and $u_2$ are orthogonal. Normalizing $u_1$ and $u_2$ yields the orthonormal vectors

$$\hat{u}_1 = (1/\sqrt{5}, -2/\sqrt{5}) \quad \text{and} \quad \hat{u}_2 = (2/\sqrt{5}, 1/\sqrt{5})$$

Finally, let P be the matrix whose columns are $\hat{u}_1$ and $\hat{u}_2$, respectively. Then

$$P = \begin{bmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{bmatrix} \quad \text{and} \quad P^{-1}AP = \begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix}$$

As expected, the diagonal entries of $P^{-1}AP$ are the eigenvalues corresponding to the columns of P.
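
A numerical check of Example 9.9 is sketched below. Because the entries of P involve $\sqrt{5}$, the code uses floating-point arithmetic and compares with a tolerance rather than exactly; the helper names and the tolerance value are choices made for this illustration.

```python
from math import isclose, sqrt


def transpose(M):
    return [list(col) for col in zip(*M)]


def mat_mul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def approx_equal(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to an absolute tolerance."""
    return all(isclose(x, y, abs_tol=tol)
               for row_x, row_y in zip(X, Y) for x, y in zip(row_x, row_y))


A = [[2, -2], [-2, 5]]
s = sqrt(5)
P = [[1 / s, 2 / s], [-2 / s, 1 / s]]  # columns are the normalized eigenvectors

PtP = mat_mul(transpose(P), P)                # should be the identity
PtAP = mat_mul(mat_mul(transpose(P), A), P)   # should be diag(6, 1)

print(approx_equal(PtP, [[1, 0], [0, 1]]))    # True
print(approx_equal(PtAP, [[6, 0], [0, 1]]))   # True
```

The first check confirms that P is orthogonal, so its transpose can stand in for its inverse in the second check.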

The procedure in the above Example 9.9 is formalized in the following algorithm, which finds an 
orthogonal matrix P such that P l AP is diagonal. 

ALGORITHM 9.2: (Orthogonal Diagonalization Algorithm) The input is a real symmetric matrix A.

Step 1. Find the characteristic polynomial $\Delta(t)$ of A.

Step 2. Find the eigenvalues of A, which are the roots of $\Delta(t)$.

Step 3. For each eigenvalue $\lambda$ of A in Step 2, find an orthogonal basis of its eigenspace.

Step 4. Normalize all eigenvectors in Step 3, which then form an orthonormal basis of $\mathbf{R}^n$.

Step 5. Let P be the matrix whose columns are the normalized eigenvectors in Step 4.


Application to Quadratic Forms 

Let q be a real polynomial in variables $x_1, x_2, \ldots, x_n$ such that every term in q has degree two; that is,

$$q(x_1, x_2, \ldots, x_n) = \sum_i c_i x_i^2 + \sum_{i<j} d_{ij} x_i x_j, \quad \text{where } c_i, d_{ij} \in \mathbf{R}$$

Then q is called a quadratic form. If there are no cross-product terms $x_i x_j$ (i.e., all $d_{ij} = 0$), then q is said to
be diagonal.

The above quadratic form q determines a real symmetric matrix $A = [a_{ij}]$, where $a_{ii} = c_i$ and
$a_{ij} = a_{ji} = \frac{1}{2} d_{ij}$. Namely, q can be written in the matrix form

$$q(X) = X^T A X$$

where $X = [x_1, x_2, \ldots, x_n]^T$ is the column vector of the variables. Furthermore, suppose X = PY is a linear
substitution of the variables. Then substitution in the quadratic form yields

$$q(Y) = (PY)^T A (PY) = Y^T (P^T A P) Y$$

Thus, $P^T A P$ is the matrix representation of q in the new variables.

We seek an orthogonal matrix P such that the orthogonal substitution X = PY yields a diagonal
quadratic form, that is, for which $P^T A P$ is diagonal. Because P is orthogonal, $P^T = P^{-1}$, and hence,
$P^T A P = P^{-1}AP$. The above theory yields such an orthogonal matrix P.

EXAMPLE 9.10 Consider the quadratic form

$$q(x, y) = 2x^2 - 4xy + 5y^2 = X^T A X, \quad \text{where} \quad A = \begin{bmatrix} 2 & -2 \\ -2 & 5 \end{bmatrix} \quad \text{and} \quad X = \begin{bmatrix} x \\ y \end{bmatrix}$$

By Example 9.9,

$$P^{-1}AP = \begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix} = P^T A P, \quad \text{where} \quad P = \begin{bmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{bmatrix}$$

Let $Y = [s, t]^T$. Then matrix P corresponds to the following linear orthogonal substitution X = PY of the variables x
and y in terms of the variables s and t:

$$x = \frac{1}{\sqrt{5}} s + \frac{2}{\sqrt{5}} t, \quad y = -\frac{2}{\sqrt{5}} s + \frac{1}{\sqrt{5}} t$$

This substitution in q(x, y) yields the diagonal quadratic form $q(s, t) = 6s^2 + t^2$.
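
The effect of the substitution can be spot-checked numerically: for any values of s and t, the original form evaluated at the substituted (x, y) should agree with $6s^2 + t^2$. The sketch below tests a few sample points chosen arbitrarily for illustration; it is a sanity check, not a proof.

```python
from math import isclose, sqrt


def q(x, y):
    """The quadratic form of Example 9.10 in the original variables."""
    return 2 * x * x - 4 * x * y + 5 * y * y


def substitute(s, t):
    """The orthogonal substitution X = PY: returns (x, y) for given (s, t)."""
    root5 = sqrt(5)
    return (s + 2 * t) / root5, (-2 * s + t) / root5


def q_diagonal(s, t):
    """The diagonal form in the new variables."""
    return 6 * s * s + t * t


samples = [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0), (-1.5, 0.25)]
for s, t in samples:
    x, y = substitute(s, t)
    print(isclose(q(x, y), q_diagonal(s, t), rel_tol=1e-12))  # True each time
```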


9.7 Minimal Polynomial 


Let A be any square matrix. Let J(A) denote the collection of all polynomials f(t) for which A is a root,
that is, for which f(A) = 0. The set J(A) is not empty, because the Cayley-Hamilton Theorem 9.2 tells us
that the characteristic polynomial $\Delta_A(t)$ of A belongs to J(A). Let m(t) denote the monic polynomial of
lowest degree in J(A). (Such a polynomial m(t) exists and is unique.) We call m(t) the minimal polynomial
of the matrix A.


Remark: A polynomial $f(t) \neq 0$ is monic if its leading coefficient equals one.

The following theorem (proved in Problem 9.33) holds.

THEOREM 9.15: The minimal polynomial m(t) of a matrix (linear operator) A divides every
polynomial that has A as a zero. In particular, m(t) divides the characteristic
polynomial $\Delta(t)$ of A.

There is an even stronger relationship between m(t) and $\Delta(t)$.

THEOREM 9.16: The characteristic polynomial $\Delta(t)$ and the minimal polynomial m(t) of a matrix A
have the same irreducible factors.

This theorem (proved in Problem 9.35) does not say that $m(t) = \Delta(t)$, only that any irreducible factor of
one must divide the other. In particular, because a linear factor is irreducible, m(t) and $\Delta(t)$ have the same
linear factors. Hence, they have the same roots. Thus, we have the following theorem.


THEOREM 9.17: A scalar $\lambda$ is an eigenvalue of the matrix A if and only if $\lambda$ is a root of the minimal
polynomial of A.


EXAMPLE 9.11 Find the minimal polynomial m(t) of $A = \begin{bmatrix} 2 & 2 & -5 \\ 3 & 7 & -15 \\ 1 & 2 & -4 \end{bmatrix}$.

First find the characteristic polynomial $\Delta(t)$ of A. We have

tr(A) = 5, $A_{11} + A_{22} + A_{33} = 2 - 3 + 8 = 7$, and |A| = 3

Hence,

$$\Delta(t) = t^3 - 5t^2 + 7t - 3 = (t - 1)^2 (t - 3)$$





The minimal polynomial m(t) must divide $\Delta(t)$. Also, each irreducible factor of $\Delta(t)$ (i.e., t − 1 and t − 3) must
also be a factor of m(t). Thus, m(t) is exactly one of the following:

$$f(t) = (t - 3)(t - 1) \quad \text{or} \quad g(t) = (t - 3)(t - 1)^2$$

We know, by the Cayley-Hamilton theorem, that $g(A) = \Delta(A) = 0$. Hence, we need only test f(t). We have

$$f(A) = (A - I)(A - 3I) = \begin{bmatrix} 1 & 2 & -5 \\ 3 & 6 & -15 \\ 1 & 2 & -5 \end{bmatrix} \begin{bmatrix} -1 & 2 & -5 \\ 3 & 4 & -15 \\ 1 & 2 & -7 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

Thus, $f(t) = m(t) = (t - 1)(t - 3) = t^2 - 4t + 3$ is the minimal polynomial of A.
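
The test used in Example 9.11 can be phrased as a small search: list the candidate polynomials in order of increasing degree, each given by its roots with multiplicity, and return the first one that annihilates A. The sketch below assumes the characteristic polynomial has already been factored into linear factors; the function names are illustrative.

```python
def mat_mul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def product_of_shifts(A, roots):
    """Return (A - r_1 I)(A - r_2 I)... for the given list of roots."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for r in roots:
        shifted = [[A[i][j] - (r if i == j else 0) for j in range(n)]
                   for i in range(n)]
        result = mat_mul(result, shifted)
    return result


def first_annihilating(A, candidates):
    """First candidate (a list of roots) whose polynomial has A as a zero."""
    n = len(A)
    zero = [[0] * n for _ in range(n)]
    for roots in candidates:  # candidates must be ordered by increasing degree
        if product_of_shifts(A, roots) == zero:
            return roots
    return None


A = [[2, 2, -5], [3, 7, -15], [1, 2, -4]]
# f(t) = (t - 1)(t - 3) first, then g(t) = (t - 1)^2 (t - 3)
print(first_annihilating(A, [[1, 3], [1, 1, 3]]))  # [1, 3]
```

The result [1, 3] corresponds to m(t) = (t − 1)(t − 3), in agreement with the example.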

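EXAMPLE 9.12

(a) Consider the following two r-square matrices, where $a \neq 0$:

$$J(\lambda, r) = \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} \lambda & a & 0 & \cdots & 0 & 0 \\ 0 & \lambda & a & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & a \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{bmatrix}$$

The first matrix, called a Jordan Block, has $\lambda$'s on the diagonal, 1's on the superdiagonal (consisting of the entries
above the diagonal entries), and 0's elsewhere. The second matrix A has $\lambda$'s on the diagonal, a's on the
superdiagonal, and 0's elsewhere. [Thus, A is a generalization of $J(\lambda, r)$.] One can show that

$$f(t) = (t - \lambda)^r$$

is both the characteristic and minimal polynomial of both $J(\lambda, r)$ and A.

(b) Consider an arbitrary monic polynomial:

$$f(t) = t^n + a_{n-1} t^{n-1} + \cdots + a_1 t + a_0$$

Let C(f) be the n-square matrix with 1's on the subdiagonal (consisting of the entries below the diagonal
entries), the negatives of the coefficients in the last column, and 0's elsewhere as follows:

$$C(f) = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{n-1} \end{bmatrix}$$

Then C(f) is called the companion matrix of the polynomial f(t). Moreover, the minimal polynomial m(t) and the
characteristic polynomial $\Delta(t)$ of the companion matrix C(f) are both equal to the original polynomial f(t).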
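
The companion matrix is easy to build, and the claim that f annihilates C(f) can be checked for a specific polynomial. The sketch below uses $f(t) = t^3 - 6t^2 + 11t - 6$, a polynomial chosen here purely for illustration; the function names are also illustrative. It verifies only that f(C(f)) = 0, which is consistent with, but weaker than, the statement that f is both the characteristic and the minimal polynomial.

```python
def companion(low_coeffs):
    """Companion matrix of t^n + a_{n-1} t^{n-1} + ... + a_0.

    low_coeffs = [a_0, a_1, ..., a_{n-1}] (the leading 1 is implied).
    """
    n = len(low_coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1               # 1's on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -low_coeffs[i]  # negated coefficients in the last column
    return C


def mat_mul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def poly_at_matrix(high_coeffs, A):
    """Horner evaluation of f(A); high_coeffs lists a_n first."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in high_coeffs:
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result


low = [-6, 11, -6]  # f(t) = t^3 - 6t^2 + 11t - 6
C = companion(low)
print(C)  # [[0, 0, 6], [1, 0, -11], [0, 1, 6]]
print(poly_at_matrix([1] + low[::-1], C))  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```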


Minimal Polynomial of a Linear Operator 

The minimal polynomial m(t) of a linear operator T is defined to be the monic polynomial of lowest degree
for which T is a root. However, for any polynomial f(t), we have

f(T) = 0 if and only if f(A) = 0

where A is any matrix representation of T. Accordingly, T and A have the same minimal polynomials.
Thus, the above theorems on the minimal polynomial of a matrix also hold for the minimal polynomial of a
linear operator. That is, we have the following theorems.

THEOREM 9.15′: The minimal polynomial m(t) of a linear operator T divides every polynomial that
has T as a root. In particular, m(t) divides the characteristic polynomial $\Delta(t)$ of T.

THEOREM 9.16′: The characteristic and minimal polynomials of a linear operator T have the same
irreducible factors.

THEOREM 9.17′: A scalar $\lambda$ is an eigenvalue of a linear operator T if and only if $\lambda$ is a root of the
minimal polynomial m(t) of T.


9.8 Characteristic and Minimal Polynomials of Block Matrices 


This section discusses the relationship of the characteristic polynomial and the minimal polynomial to 
certain (square) block matrices. 


Characteristic Polynomial and Block Triangular Matrices

Suppose M is a block triangular matrix, say $M = \begin{bmatrix} A_1 & B \\ 0 & A_2 \end{bmatrix}$, where $A_1$ and $A_2$ are square matrices. Then
tI − M is also a block triangular matrix, with diagonal blocks $tI - A_1$ and $tI - A_2$. Thus,

$$|tI - M| = \begin{vmatrix} tI - A_1 & -B \\ 0 & tI - A_2 \end{vmatrix} = |tI - A_1|\,|tI - A_2|$$

That is, the characteristic polynomial of M is the product of the characteristic polynomials of the diagonal
blocks $A_1$ and $A_2$.

By induction, we obtain the following useful result.


THEOREM 9.18: Suppose M is a block triangular matrix with diagonal blocks $A_1, A_2, \ldots, A_r$. Then the
characteristic polynomial of M is the product of the characteristic polynomials of the
diagonal blocks $A_i$; that is,

$$\Delta_M(t) = \Delta_{A_1}(t)\,\Delta_{A_2}(t) \cdots \Delta_{A_r}(t)$$


EXAMPLE 9.13 Consider the matrix $M = \left[\begin{array}{cc|cc} 9 & -1 & 5 & 7 \\ 8 & 3 & 2 & -4 \\ \hline 0 & 0 & 3 & 6 \\ 0 & 0 & -1 & 8 \end{array}\right]$.

Then M is a block triangular matrix with diagonal blocks $A = \begin{bmatrix} 9 & -1 \\ 8 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 6 \\ -1 & 8 \end{bmatrix}$. Here

tr(A) = 9 + 3 = 12, det(A) = 27 + 8 = 35, and so $\Delta_A(t) = t^2 - 12t + 35 = (t - 5)(t - 7)$
tr(B) = 3 + 8 = 11, det(B) = 24 + 6 = 30, and so $\Delta_B(t) = t^2 - 11t + 30 = (t - 5)(t - 6)$

Accordingly, the characteristic polynomial of M is the product

$$\Delta_M(t) = \Delta_A(t)\,\Delta_B(t) = (t - 5)^2 (t - 6)(t - 7)$$
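
The product in Example 9.13 can be expanded by multiplying coefficient lists. The sketch below represents each polynomial as a list of coefficients, highest degree first (a convention adopted here for illustration), and confirms that the product of the two block polynomials equals the expansion of the factored form.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest degree first)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


delta_A = [1, -12, 35]  # t^2 - 12t + 35
delta_B = [1, -11, 30]  # t^2 - 11t + 30
delta_M = poly_mul(delta_A, delta_B)
print(delta_M)  # [1, -23, 197, -745, 1050]

# Expansion of (t - 5)^2 (t - 6)(t - 7) for comparison
factored = poly_mul(poly_mul([1, -5], [1, -5]), poly_mul([1, -6], [1, -7]))
print(factored == delta_M)  # True
```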


Minimal Polynomial and Block Diagonal Matrices 

The following theorem (proved in Problem 9.36) holds. 

THEOREM 9.19: Suppose M is a block diagonal matrix with diagonal blocks $A_1, A_2, \ldots, A_r$. Then the
minimal polynomial of M is equal to the least common multiple (LCM) of the
minimal polynomials of the diagonal blocks $A_i$.

Remark: We emphasize that this theorem applies to block diagonal matrices, whereas the analogous
Theorem 9.18 on characteristic polynomials applies to block triangular matrices.

EXAMPLE 9.14 Find the characteristic polynomial $\Delta(t)$ and the minimal polynomial m(t) of the block diagonal
matrix:

$$M = \begin{bmatrix} 2 & 5 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 4 & 2 & 0 \\ 0 & 0 & 3 & 5 & 0 \\ 0 & 0 & 0 & 0 & 7 \end{bmatrix} = \mathrm{diag}(A_1, A_2, A_3), \quad \text{where} \quad A_1 = \begin{bmatrix} 2 & 5 \\ 0 & 2 \end{bmatrix}, \; A_2 = \begin{bmatrix} 4 & 2 \\ 3 & 5 \end{bmatrix}, \; A_3 = [7]$$

Then $\Delta(t)$ is the product of the characteristic polynomials $\Delta_1(t)$, $\Delta_2(t)$, $\Delta_3(t)$ of $A_1$, $A_2$, $A_3$, respectively.
One can show that

$$\Delta_1(t) = (t - 2)^2, \quad \Delta_2(t) = (t - 2)(t - 7), \quad \Delta_3(t) = t - 7$$

Thus, $\Delta(t) = (t - 2)^3 (t - 7)^2$. [As expected, $\deg \Delta(t) = 5$.]

The minimal polynomials $m_1(t)$, $m_2(t)$, $m_3(t)$ of the diagonal blocks $A_1$, $A_2$, $A_3$, respectively, are equal to the
characteristic polynomials; that is,

$$m_1(t) = (t - 2)^2, \quad m_2(t) = (t - 2)(t - 7), \quad m_3(t) = t - 7$$

But m(t) is equal to the least common multiple of $m_1(t)$, $m_2(t)$, $m_3(t)$. Thus, $m(t) = (t - 2)^2 (t - 7)$.
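
When every polynomial involved splits into linear factors, the product and the least common multiple reduce to bookkeeping on exponents: a product adds multiplicities, and an LCM takes the maximum. The sketch below encodes each block polynomial of Example 9.14 as a `collections.Counter` mapping a root to its multiplicity. This encoding is an assumption made for illustration and does not cover polynomials with irreducible factors of higher degree.

```python
from collections import Counter

# Root -> multiplicity for each diagonal block. In this example each block's
# characteristic polynomial coincides with its minimal polynomial.
block_polys = [
    Counter({2: 2}),        # (t - 2)^2
    Counter({2: 1, 7: 1}),  # (t - 2)(t - 7)
    Counter({7: 1}),        # t - 7
]

characteristic = Counter()
minimal = Counter()
for poly in block_polys:
    characteristic += poly  # product of polynomials: multiplicities add
    minimal |= poly         # least common multiple: take the maximum

print(dict(characteristic))  # {2: 3, 7: 2}
print(dict(minimal))         # {2: 2, 7: 1}
```

The two results correspond to $\Delta(t) = (t - 2)^3 (t - 7)^2$ and $m(t) = (t - 2)^2 (t - 7)$.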


SOLVED PROBLEMS 


Polynomials of Matrices, Characteristic Polynomials 


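9.1. Let $A = \begin{bmatrix} 1 & -2 \\ 4 & 5 \end{bmatrix}$. Find f(A), where
(a) $f(t) = t^2 - 3t + 7$, (b) $f(t) = t^2 - 6t + 13$

First find $A^2 = \begin{bmatrix} 1 & -2 \\ 4 & 5 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} -7 & -12 \\ 24 & 17 \end{bmatrix}$. Then

(a) $f(A) = A^2 - 3A + 7I = \begin{bmatrix} -7 & -12 \\ 24 & 17 \end{bmatrix} + \begin{bmatrix} -3 & 6 \\ -12 & -15 \end{bmatrix} + \begin{bmatrix} 7 & 0 \\ 0 & 7 \end{bmatrix} = \begin{bmatrix} -3 & -6 \\ 12 & 9 \end{bmatrix}$

(b) $f(A) = A^2 - 6A + 13I = \begin{bmatrix} -7 & -12 \\ 24 & 17 \end{bmatrix} + \begin{bmatrix} -6 & 12 \\ -24 & -30 \end{bmatrix} + \begin{bmatrix} 13 & 0 \\ 0 & 13 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$
[Thus, A is a root of f(t).]


9.2. Find the characteristic polynomial $\Delta(t)$ of each of the following matrices:

(a) $A = \begin{bmatrix} 2 & 5 \\ 4 & 1 \end{bmatrix}$, (b) $B = \begin{bmatrix} 7 & -3 \\ 5 & -2 \end{bmatrix}$, (c) $C = \begin{bmatrix} 3 & -2 \\ 9 & -3 \end{bmatrix}$

Use the formula $\Delta(t) = t^2 - \mathrm{tr}(M)\,t + |M|$ for a 2 × 2 matrix M:

(a) tr(A) = 2 + 1 = 3, |A| = 2 − 20 = −18, so $\Delta(t) = t^2 - 3t - 18$

(b) tr(B) = 7 − 2 = 5, |B| = −14 + 15 = 1, so $\Delta(t) = t^2 - 5t + 1$

(c) tr(C) = 3 − 3 = 0, |C| = −9 + 18 = 9, so $\Delta(t) = t^2 + 9$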


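9.3. Find the characteristic polynomial $\Delta(t)$ of each of the following matrices:

\[
\text{(a)}\ A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 4 \\ 6 & 4 & 5 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 1 & 6 & -2 \\ -3 & 2 & 0 \\ 0 & 3 & -4 \end{bmatrix}
\]

Use the formula $\Delta(t) = t^3 - \operatorname{tr}(A)\,t^2 + (A_{11} + A_{22} + A_{33})\,t - |A|$, where $A_{ii}$ is the cofactor of $a_{ii}$ in the $3 \times 3$ matrix $A = [a_{ij}]$.

(a) $\operatorname{tr}(A) = 1 + 0 + 5 = 6$,

\[
A_{11} = \begin{vmatrix} 0 & 4 \\ 4 & 5 \end{vmatrix} = -16, \qquad A_{22} = \begin{vmatrix} 1 & 3 \\ 6 & 5 \end{vmatrix} = -13, \qquad A_{33} = \begin{vmatrix} 1 & 2 \\ 3 & 0 \end{vmatrix} = -6
\]

\[
A_{11} + A_{22} + A_{33} = -35, \qquad \text{and} \qquad |A| = 48 + 36 - 16 - 30 = 38
\]

Thus, $\Delta(t) = t^3 - 6t^2 - 35t - 38$

(b) $\operatorname{tr}(B) = 1 + 2 - 4 = -1$,

\[
B_{11} = \begin{vmatrix} 2 & 0 \\ 3 & -4 \end{vmatrix} = -8, \qquad B_{22} = \begin{vmatrix} 1 & -2 \\ 0 & -4 \end{vmatrix} = -4, \qquad B_{33} = \begin{vmatrix} 1 & 6 \\ -3 & 2 \end{vmatrix} = 20
\]

\[
B_{11} + B_{22} + B_{33} = 8, \qquad \text{and} \qquad |B| = -8 + 18 - 72 = -62
\]

Thus, $\Delta(t) = t^3 + t^2 + 8t + 62$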
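For matrices larger than $3 \times 3$ the cofactor formula becomes unwieldy. As an alternative that the text does not use, the sketch below computes the coefficients of $|tI - A|$ with the Faddeev-LeVerrier recurrence, in exact arithmetic. The function name `char_poly` and the list-of-rows representation are assumptions made for illustration.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def char_poly(A):
    """Coefficients of det(tI - A), highest degree first (Faddeev-LeVerrier)."""
    n = len(A)
    coeffs = [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c * I, where c is the most recently found coefficient
        M = mat_mul(A, M)
        for i in range(n):
            M[i][i] += coeffs[-1]
        AM = mat_mul(A, M)
        coeffs.append(-sum(AM[i][i] for i in range(n)) / k)
    return coeffs

A = [[1, 2, 3], [3, 0, 4], [6, 4, 5]]   # matrix A of Problem 9.3(a)
print([int(c) for c in char_poly(A)])
```

For the matrix $A$ of Problem 9.3(a) the recurrence returns the coefficients $1, -6, -35, -38$, in agreement with the cofactor formula.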

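9.4. Find the characteristic polynomial $\Delta(t)$ of each of the following matrices:

\[
\text{(a)}\ A = \begin{bmatrix} 2 & 5 & 1 & 1 \\ 1 & 4 & 2 & 2 \\ 0 & 0 & 6 & -5 \\ 0 & 0 & 2 & 3 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 1 & 1 & 2 & 2 \\ 0 & 3 & 3 & 4 \\ 0 & 0 & 5 & 5 \\ 0 & 0 & 0 & 6 \end{bmatrix}
\]

(a) $A$ is block triangular with diagonal blocks

\[
A_1 = \begin{bmatrix} 2 & 5 \\ 1 & 4 \end{bmatrix} \qquad \text{and} \qquad A_2 = \begin{bmatrix} 6 & -5 \\ 2 & 3 \end{bmatrix}
\]

Thus, $\Delta(t) = \Delta_{A_1}(t)\,\Delta_{A_2}(t) = (t^2 - 6t + 3)(t^2 - 9t + 28)$

(b) Because $B$ is triangular, $\Delta(t) = (t - 1)(t - 3)(t - 5)(t - 6)$.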

9.5. Find the characteristic polynomial $\Delta(t)$ of each of the following linear operators:

(a) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (3x + 5y,\ 2x - 7y)$.

(b) $D: V \to V$ defined by $D(f) = df/dt$, where $V$ is the space of functions with basis $S = \{\sin t, \cos t\}$.

The characteristic polynomial $\Delta(t)$ of a linear operator is equal to the characteristic polynomial of any matrix $A$ that represents the linear operator.

(a) Find the matrix $A$ that represents $F$ relative to the usual basis of $\mathbf{R}^2$. We have

\[
A = \begin{bmatrix} 3 & 5 \\ 2 & -7 \end{bmatrix}, \qquad \text{so} \qquad \Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 + 4t - 31
\]

(b) Find the matrix $A$ representing the differential operator $D$ relative to the basis $S$. We have

\[
\begin{aligned}
D(\sin t) &= \cos t = 0(\sin t) + 1(\cos t) \\
D(\cos t) &= -\sin t = -1(\sin t) + 0(\cos t)
\end{aligned}
\qquad \text{and so} \qquad A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
\]

Therefore, $\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 + 1$

9.6. Show that a matrix $A$ and its transpose $A^T$ have the same characteristic polynomial.

By the transpose operation, $(tI - A)^T = tI^T - A^T = tI - A^T$. Because a matrix and its transpose have the same determinant,

\[
\Delta_A(t) = |tI - A| = |(tI - A)^T| = |tI - A^T| = \Delta_{A^T}(t)
\]

9.7. Prove Theorem 9.1: Let $f$ and $g$ be polynomials. For any square matrix $A$ and scalar $k$,

(i) $(f + g)(A) = f(A) + g(A)$, (iii) $(kf)(A) = kf(A)$,

(ii) $(fg)(A) = f(A)g(A)$, (iv) $f(A)g(A) = g(A)f(A)$.

Suppose $f = a_n t^n + \cdots + a_1 t + a_0$ and $g = b_m t^m + \cdots + b_1 t + b_0$. Then, by definition,

\[
f(A) = a_n A^n + \cdots + a_1 A + a_0 I \qquad \text{and} \qquad g(A) = b_m A^m + \cdots + b_1 A + b_0 I
\]

(i) Suppose $m \le n$ and let $b_i = 0$ if $i > m$. Then

\[
f + g = (a_n + b_n)t^n + \cdots + (a_1 + b_1)t + (a_0 + b_0)
\]

Hence,

\[
\begin{aligned}
(f + g)(A) &= (a_n + b_n)A^n + \cdots + (a_1 + b_1)A + (a_0 + b_0)I \\
&= a_n A^n + b_n A^n + \cdots + a_1 A + b_1 A + a_0 I + b_0 I = f(A) + g(A)
\end{aligned}
\]

(ii) By definition, $fg = c_{n+m}t^{n+m} + \cdots + c_1 t + c_0 = \sum_{k=0}^{n+m} c_k t^k$, where

\[
c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0 = \sum_{i=0}^{k} a_i b_{k-i}
\]

Hence, $(fg)(A) = \sum_{k=0}^{n+m} c_k A^k$ and

\[
f(A)g(A) = \left(\sum_{i=0}^{n} a_i A^i\right)\left(\sum_{j=0}^{m} b_j A^j\right) = \sum_{i=0}^{n}\sum_{j=0}^{m} a_i b_j A^{i+j} = \sum_{k=0}^{n+m} c_k A^k = (fg)(A)
\]

(iii) By definition, $kf = ka_n t^n + \cdots + ka_1 t + ka_0$, and so

\[
(kf)(A) = ka_n A^n + \cdots + ka_1 A + ka_0 I = k(a_n A^n + \cdots + a_1 A + a_0 I) = kf(A)
\]

(iv) By (ii), $g(A)f(A) = (gf)(A) = (fg)(A) = f(A)g(A)$.

9.8. Prove the Cayley-Hamilton Theorem 9.2: Every matrix $A$ is a root of its characteristic polynomial $\Delta(t)$.

Let $A$ be an arbitrary $n$-square matrix and let $\Delta(t)$ be its characteristic polynomial, say,

\[
\Delta(t) = |tI - A| = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0
\]

Now let $B(t)$ denote the classical adjoint of the matrix $tI - A$. The elements of $B(t)$ are cofactors of the matrix $tI - A$ and hence are polynomials in $t$ of degree not exceeding $n - 1$. Thus,

\[
B(t) = B_{n-1}t^{n-1} + \cdots + B_1 t + B_0
\]

where the $B_i$ are $n$-square matrices over $K$ which are independent of $t$. By the fundamental property of the classical adjoint (Theorem 8.9), $(tI - A)B(t) = |tI - A|\,I$, or

\[
(tI - A)(B_{n-1}t^{n-1} + \cdots + B_1 t + B_0) = (t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0)I
\]

Removing the parentheses and equating corresponding powers of $t$ yields

\[
B_{n-1} = I, \quad B_{n-2} - AB_{n-1} = a_{n-1}I, \quad \ldots, \quad B_0 - AB_1 = a_1 I, \quad -AB_0 = a_0 I
\]

Multiplying the above equations by $A^n, A^{n-1}, \ldots, A, I$, respectively, yields

\[
A^n B_{n-1} = A^n, \quad A^{n-1}B_{n-2} - A^n B_{n-1} = a_{n-1}A^{n-1}, \quad \ldots, \quad AB_0 - A^2 B_1 = a_1 A, \quad -AB_0 = a_0 I
\]

Adding the above matrix equations yields 0 on the left-hand side and $\Delta(A)$ on the right-hand side; that is,

\[
0 = A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I
\]

Therefore, $\Delta(A) = 0$, which is the Cayley-Hamilton theorem.
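The theorem can be spot-checked on the matrices of the earlier problems. The sketch below is a sanity check, not a proof, and the helper name `poly_at_matrix` is illustrative. It substitutes the matrix $A$ of Problem 9.1 into $t^2 - 6t + 13$ (its characteristic polynomial, since $\operatorname{tr}(A) = 6$ and $|A| = 13$) and the matrix $A$ of Problem 9.3(a) into $t^3 - 6t^2 - 35t - 38$.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def poly_at_matrix(coeffs, A):
    """Evaluate a polynomial (coefficients highest degree first) at the matrix A."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

A2 = [[1, -2], [4, 5]]                      # Problem 9.1
A3 = [[1, 2, 3], [3, 0, 4], [6, 4, 5]]      # Problem 9.3(a)

print(poly_at_matrix([1, -6, 13], A2))
print(poly_at_matrix([1, -6, -35, -38], A3))
```

Both results are zero matrices, as the theorem requires.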
Eigenvalues and Eigenvectors of 2 x 2 Matrices

9.9. Let $A = \begin{bmatrix} 3 & -4 \\ 2 & -6 \end{bmatrix}$.

(a) Find all eigenvalues and corresponding eigenvectors.

(b) Find matrices $P$ and $D$ such that $P$ is nonsingular and $D = P^{-1}AP$ is diagonal.

(a) First find the characteristic polynomial $\Delta(t)$ of $A$:

\[
\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 + 3t - 10 = (t - 2)(t + 5)
\]

The roots $\lambda = 2$ and $\lambda = -5$ of $\Delta(t)$ are the eigenvalues of $A$. We find corresponding eigenvectors.

(i) Subtract $\lambda = 2$ down the diagonal of $A$ to obtain the matrix $M = A - 2I$, where the corresponding homogeneous system $MX = 0$ yields the eigenvectors corresponding to $\lambda = 2$. We have

\[
M = \begin{bmatrix} 1 & -4 \\ 2 & -8 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} x - 4y &= 0 \\ 2x - 8y &= 0 \end{aligned} \qquad \text{or} \qquad x - 4y = 0
\]

The system has only one free variable, and $v_1 = (4, 1)$ is a nonzero solution. Thus, $v_1 = (4, 1)$ is an eigenvector belonging to (and spanning the eigenspace of) $\lambda = 2$.

(ii) Subtract $\lambda = -5$ (or, equivalently, add 5) down the diagonal of $A$ to obtain

\[
M = \begin{bmatrix} 8 & -4 \\ 2 & -1 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} 8x - 4y &= 0 \\ 2x - y &= 0 \end{aligned} \qquad \text{or} \qquad 2x - y = 0
\]

The system has only one free variable, and $v_2 = (1, 2)$ is a nonzero solution. Thus, $v_2 = (1, 2)$ is an eigenvector belonging to $\lambda = -5$.

(b) Let $P$ be the matrix whose columns are $v_1$ and $v_2$. Then

\[
P = \begin{bmatrix} 4 & 1 \\ 1 & 2 \end{bmatrix} \qquad \text{and} \qquad D = P^{-1}AP = \begin{bmatrix} 2 & 0 \\ 0 & -5 \end{bmatrix}
\]

Note that $D$ is the diagonal matrix whose diagonal entries are the eigenvalues of $A$ corresponding to the eigenvectors appearing in $P$.


Remark: Here $P$ is the change-of-basis matrix from the usual basis of $\mathbf{R}^2$ to the basis $S = \{v_1, v_2\}$, and $D$ is the matrix that represents (the matrix function) $A$ relative to the new basis $S$.

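9.10. Let $A = \begin{bmatrix} 2 & 2 \\ 1 & 3 \end{bmatrix}$.

(a) Find all eigenvalues and corresponding eigenvectors.

(b) Find a nonsingular matrix $P$ such that $D = P^{-1}AP$ is diagonal, and find $P^{-1}$.

(c) Find $A^6$ and $f(A)$, where $f(t) = t^4 - 3t^3 - 6t^2 + 7t + 3$.

(d) Find a "real cube root" of $A$, that is, a matrix $B$ such that $B^3 = A$ and $B$ has real eigenvalues.

(a) First find the characteristic polynomial $\Delta(t)$ of $A$:

\[
\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 - 5t + 4 = (t - 1)(t - 4)
\]

The roots $\lambda = 1$ and $\lambda = 4$ of $\Delta(t)$ are the eigenvalues of $A$. We find corresponding eigenvectors.

(i) Subtract $\lambda = 1$ down the diagonal of $A$ to obtain the matrix $M = A - \lambda I$, where the corresponding homogeneous system $MX = 0$ yields the eigenvectors belonging to $\lambda = 1$. We have

\[
M = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} x + 2y &= 0 \\ x + 2y &= 0 \end{aligned} \qquad \text{or} \qquad x + 2y = 0
\]

The system has only one independent solution; for example, $x = 2$, $y = -1$. Thus, $v_1 = (2, -1)$ is an eigenvector belonging to (and spanning the eigenspace of) $\lambda = 1$.

(ii) Subtract $\lambda = 4$ down the diagonal of $A$ to obtain

\[
M = \begin{bmatrix} -2 & 2 \\ 1 & -1 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} -2x + 2y &= 0 \\ x - y &= 0 \end{aligned} \qquad \text{or} \qquad x - y = 0
\]

The system has only one independent solution; for example, $x = 1$, $y = 1$. Thus, $v_2 = (1, 1)$ is an eigenvector belonging to $\lambda = 4$.

(b) Let $P$ be the matrix whose columns are $v_1$ and $v_2$. Then

\[
P = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix} \qquad \text{and} \qquad D = P^{-1}AP = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix}, \qquad \text{where} \qquad P^{-1} = \begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac13 & \tfrac23 \end{bmatrix}
\]

(c) Using the diagonal factorization $A = PDP^{-1}$, and $1^6 = 1$ and $4^6 = 4096$, we get

\[
A^6 = PD^6P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 4096 \end{bmatrix}\begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac13 & \tfrac23 \end{bmatrix} = \begin{bmatrix} 1366 & 2730 \\ 1365 & 2731 \end{bmatrix}
\]

Also, $f(1) = 2$ and $f(4) = -1$. Hence,

\[
f(A) = Pf(D)P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac13 & \tfrac23 \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ -1 & 0 \end{bmatrix}
\]

(d) Here $\begin{bmatrix} 1 & 0 \\ 0 & \sqrt[3]{4} \end{bmatrix}$ is the real cube root of $D$. Hence the real cube root of $A$ is

\[
B = P\sqrt[3]{D}\,P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & \sqrt[3]{4} \end{bmatrix}\begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac13 & \tfrac23 \end{bmatrix} = \frac13\begin{bmatrix} 2 + \sqrt[3]{4} & -2 + 2\sqrt[3]{4} \\ -1 + \sqrt[3]{4} & 1 + 2\sqrt[3]{4} \end{bmatrix}
\]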
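Part (c) of Problem 9.10 can be cross-checked without the diagonal factorization, by multiplying $A$ out directly. The sketch below assumes the list-of-rows representation and the illustrative helper names `mat_pow` and `poly_at_matrix`; it computes $A^6$ by repeated multiplication and $f(A)$ for $f(t) = t^4 - 3t^3 - 6t^2 + 7t + 3$ by Horner's rule.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def mat_pow(A, p):
    """A raised to the nonnegative integer power p by repeated multiplication."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        result = mat_mul(result, A)
    return result

def poly_at_matrix(coeffs, A):
    """Evaluate a polynomial (coefficients highest degree first) at the matrix A."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

A = [[2, 2], [1, 3]]
print(mat_pow(A, 6))                          # [[1366, 2730], [1365, 2731]]
print(poly_at_matrix([1, -3, -6, 7, 3], A))   # [[1, -2], [-1, 0]]
```

The direct product agrees with $PD^6P^{-1}$, since $(2 \cdot 4096 - 2)/3 = 2730$.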


9.11. Each of the following real matrices defines a linear transformation on $\mathbf{R}^2$:

\[
\text{(a)}\ A = \begin{bmatrix} 5 & 6 \\ 3 & -2 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 1 & -1 \\ 2 & -1 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} 5 & -1 \\ 1 & 3 \end{bmatrix}
\]

Find, for each matrix, all eigenvalues and a maximal set $S$ of linearly independent eigenvectors. Which of these linear operators are diagonalizable, that is, which can be represented by a diagonal matrix?

(a) First find $\Delta(t) = t^2 - 3t - 28 = (t - 7)(t + 4)$. The roots $\lambda = 7$ and $\lambda = -4$ are the eigenvalues of $A$. We find corresponding eigenvectors.

(i) Subtract $\lambda = 7$ down the diagonal of $A$ to obtain

\[
M = \begin{bmatrix} -2 & 6 \\ 3 & -9 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} -2x + 6y &= 0 \\ 3x - 9y &= 0 \end{aligned} \qquad \text{or} \qquad x - 3y = 0
\]

Here $v_1 = (3, 1)$ is a nonzero solution.

(ii) Subtract $\lambda = -4$ (or add 4) down the diagonal of $A$ to obtain

\[
M = \begin{bmatrix} 9 & 6 \\ 3 & 2 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} 9x + 6y &= 0 \\ 3x + 2y &= 0 \end{aligned} \qquad \text{or} \qquad 3x + 2y = 0
\]

Here $v_2 = (2, -3)$ is a nonzero solution.

Then $S = \{v_1, v_2\} = \{(3, 1), (2, -3)\}$ is a maximal set of linearly independent eigenvectors. Because $S$ is a basis of $\mathbf{R}^2$, $A$ is diagonalizable. Using the basis $S$, $A$ is represented by the diagonal matrix $D = \operatorname{diag}(7, -4)$.

(b) First find the characteristic polynomial $\Delta(t) = t^2 + 1$. There are no real roots. Thus $B$, a real matrix representing a linear transformation on $\mathbf{R}^2$, has no eigenvalues and no eigenvectors. Hence, in particular, $B$ is not diagonalizable.

(c) First find $\Delta(t) = t^2 - 8t + 16 = (t - 4)^2$. Thus, $\lambda = 4$ is the only eigenvalue of $C$. Subtract $\lambda = 4$ down the diagonal of $C$ to obtain

\[
M = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}, \qquad \text{corresponding to} \qquad x - y = 0
\]

The homogeneous system has only one independent solution; for example, $x = 1$, $y = 1$. Thus, $v = (1, 1)$ is an eigenvector of $C$. Furthermore, as there are no other eigenvalues, the singleton set $S = \{v\} = \{(1, 1)\}$ is a maximal set of linearly independent eigenvectors of $C$. Because $S$ is not a basis of $\mathbf{R}^2$, $C$ is not diagonalizable.
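The three outcomes in Problem 9.11 correspond to the sign of the discriminant of $\Delta(t) = t^2 - \operatorname{tr}(M)\,t + |M|$. The sketch below is an illustration for real $2 \times 2$ matrices only; the function name `classify_2x2` and the returned strings are made up for the example.

```python
def classify_2x2(M):
    """Classify a real 2x2 matrix by the discriminant of t^2 - tr(M) t + det(M)."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc > 0:
        return "distinct real eigenvalues, diagonalizable over R"
    if disc < 0:
        return "no real eigenvalues, not diagonalizable over R"
    # Repeated eigenvalue trace/2: diagonalizable only if M is already that scalar matrix.
    if b == 0 and c == 0:
        return "scalar matrix, already diagonal"
    return "repeated eigenvalue, not diagonalizable"

for name, M in [("A", [[5, 6], [3, -2]]),
                ("B", [[1, -1], [2, -1]]),
                ("C", [[5, -1], [1, 3]])]:
    print(name, classify_2x2(M))
```

A repeated eigenvalue $\lambda$ gives a diagonalizable matrix only when $M = \lambda I$, because a diagonalizable matrix with the single eigenvalue $\lambda$ is similar to, and hence equal to, $\lambda I$.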


9.12. Suppose the matrix $B$ in Problem 9.11 represents a linear operator on complex space $\mathbf{C}^2$. Show that, in this case, $B$ is diagonalizable by finding a basis $S$ of $\mathbf{C}^2$ consisting of eigenvectors of $B$.

The characteristic polynomial of $B$ is still $\Delta(t) = t^2 + 1$. As a polynomial over $\mathbf{C}$, $\Delta(t)$ does factor; specifically, $\Delta(t) = (t - i)(t + i)$. Thus, $\lambda = i$ and $\lambda = -i$ are the eigenvalues of $B$.

(i) Subtract $\lambda = i$ down the diagonal of $B$ to obtain the homogeneous system

\[
\begin{aligned} (1 - i)x - y &= 0 \\ 2x + (-1 - i)y &= 0 \end{aligned} \qquad \text{or} \qquad (1 - i)x - y = 0
\]

The system has only one independent solution; for example, $x = 1$, $y = 1 - i$. Thus, $v_1 = (1, 1 - i)$ is an eigenvector that spans the eigenspace of $\lambda = i$.

(ii) Subtract $\lambda = -i$ (or add $i$) down the diagonal of $B$ to obtain the homogeneous system

\[
\begin{aligned} (1 + i)x - y &= 0 \\ 2x + (-1 + i)y &= 0 \end{aligned} \qquad \text{or} \qquad (1 + i)x - y = 0
\]

The system has only one independent solution; for example, $x = 1$, $y = 1 + i$. Thus, $v_2 = (1, 1 + i)$ is an eigenvector that spans the eigenspace of $\lambda = -i$.

As a complex matrix, $B$ is diagonalizable. Specifically, $S = \{v_1, v_2\} = \{(1, 1 - i), (1, 1 + i)\}$ is a basis of $\mathbf{C}^2$ consisting of eigenvectors of $B$. Using this basis $S$, $B$ is represented by the diagonal matrix $D = \operatorname{diag}(i, -i)$.

9.13. Let $L$ be the linear transformation on $\mathbf{R}^2$ that reflects each point $P$ across the line $y = kx$, where $k > 0$. (See Fig. 9-1.)

(a) Show that $v_1 = (1, k)$ and $v_2 = (k, -1)$ are eigenvectors of $L$.

(b) Show that $L$ is diagonalizable, and find a diagonal representation $D$.

(a) The vector $v_1 = (1, k)$ lies on the line $y = kx$, and hence is left fixed by $L$; that is, $L(v_1) = v_1$. Thus, $v_1$ is an eigenvector of $L$ belonging to the eigenvalue $\lambda_1 = 1$.

The vector $v_2 = (k, -1)$ is perpendicular to the line $y = kx$ (because $v_1 \cdot v_2 = k - k = 0$), and hence, $L$ reflects $v_2$ into its negative; that is, $L(v_2) = -v_2$. Thus, $v_2$ is an eigenvector of $L$ belonging to the eigenvalue $\lambda_2 = -1$.

(b) Here $S = \{v_1, v_2\}$ is a basis of $\mathbf{R}^2$ consisting of eigenvectors of $L$. Thus, $L$ is diagonalizable, with the diagonal representation $D = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ (relative to the basis $S$).
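The reflection across $y = kx$ can be written explicitly as $L(v) = 2\,\mathrm{proj}_d(v) - v$, where $d = (1, k)$ is a direction vector of the line. The sketch below uses this standard formula, which is not derived in the text, to check the two eigenvalue relations for the sample slope $k = 2$; the function name `reflect` is illustrative.

```python
from fractions import Fraction

def reflect(v, k):
    """Reflect v = (x, y) across the line y = k*x via L(v) = 2*proj_d(v) - v, d = (1, k)."""
    x, y = v
    d = (1, k)
    scale = Fraction(2 * (x * d[0] + y * d[1]), d[0] ** 2 + d[1] ** 2)
    return (scale * d[0] - x, scale * d[1] - y)

k = 2
on_line = (1, k)         # direction vector of the line y = kx
normal = (k, -1)         # perpendicular to the line

print(reflect(on_line, k))   # fixed by L (eigenvalue 1)
print(reflect(normal, k))    # sent to its negative (eigenvalue -1)
```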


Eigenvalues and Eigenvectors 


9.14. Let $A = \begin{bmatrix} 4 & 1 & -1 \\ 2 & 5 & -2 \\ 1 & 1 & 2 \end{bmatrix}$.

(a) Find all eigenvalues of $A$.

(b) Find a maximal set $S$ of linearly independent eigenvectors of $A$.

(c) Is $A$ diagonalizable? If yes, find $P$ such that $D = P^{-1}AP$ is diagonal.

(a) First find the characteristic polynomial $\Delta(t)$ of $A$. We have

\[
\operatorname{tr}(A) = 4 + 5 + 2 = 11 \qquad \text{and} \qquad |A| = 40 - 2 - 2 + 5 + 8 - 4 = 45
\]

Also, find each cofactor $A_{ii}$ of $a_{ii}$ in $A$:

\[
A_{11} = \begin{vmatrix} 5 & -2 \\ 1 & 2 \end{vmatrix} = 12, \qquad A_{22} = \begin{vmatrix} 4 & -1 \\ 1 & 2 \end{vmatrix} = 9, \qquad A_{33} = \begin{vmatrix} 4 & 1 \\ 2 & 5 \end{vmatrix} = 18
\]

Hence, $\Delta(t) = t^3 - \operatorname{tr}(A)\,t^2 + (A_{11} + A_{22} + A_{33})\,t - |A| = t^3 - 11t^2 + 39t - 45$

Assuming $\Delta(t)$ has a rational root, it must be among $\pm 1, \pm 3, \pm 5, \pm 9, \pm 15, \pm 45$. Testing, by synthetic division, we get

\[
\begin{array}{r|rrrr}
3 & 1 & -11 & 39 & -45 \\
  &   & 3 & -24 & 45 \\ \hline
  & 1 & -8 & 15 & 0
\end{array}
\]

Thus, $t = 3$ is a root of $\Delta(t)$. Also, $t - 3$ is a factor and $t^2 - 8t + 15$ is a factor. Hence,

\[
\Delta(t) = (t - 3)(t^2 - 8t + 15) = (t - 3)(t - 5)(t - 3) = (t - 3)^2(t - 5)
\]

Accordingly, $\lambda = 3$ and $\lambda = 5$ are eigenvalues of $A$.

(b) Find linearly independent eigenvectors for each eigenvalue of $A$.

(i) Subtract $\lambda = 3$ down the diagonal of $A$ to obtain the matrix

\[
M = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 2 & -2 \\ 1 & 1 & -1 \end{bmatrix}, \qquad \text{corresponding to} \qquad x + y - z = 0
\]

Here $u = (1, -1, 0)$ and $v = (1, 0, 1)$ are linearly independent solutions.

(ii) Subtract $\lambda = 5$ down the diagonal of $A$ to obtain the matrix

\[
M = \begin{bmatrix} -1 & 1 & -1 \\ 2 & 0 & -2 \\ 1 & 1 & -3 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} -x + y - z &= 0 \\ 2x - 2z &= 0 \\ x + y - 3z &= 0 \end{aligned} \qquad \text{or} \qquad \begin{aligned} x - z &= 0 \\ y - 2z &= 0 \end{aligned}
\]

Only $z$ is a free variable. Here $w = (1, 2, 1)$ is a solution.

Thus, $S = \{u, v, w\} = \{(1, -1, 0), (1, 0, 1), (1, 2, 1)\}$ is a maximal set of linearly independent eigenvectors of $A$.


Remark: The vectors $u$ and $v$ were chosen so that they were independent solutions of the system $x + y - z = 0$. On the other hand, $w$ is automatically independent of $u$ and $v$ because $w$ belongs to a different eigenvalue of $A$. Thus, the three vectors are linearly independent.

(c) $A$ is diagonalizable, because it has three linearly independent eigenvectors. Let $P$ be the matrix with columns $u$, $v$, $w$. Then

\[
P = \begin{bmatrix} 1 & 1 & 1 \\ -1 & 0 & 2 \\ 0 & 1 & 1 \end{bmatrix} \qquad \text{and} \qquad D = P^{-1}AP = \begin{bmatrix} 3 & & \\ & 3 & \\ & & 5 \end{bmatrix}
\]
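The root search in part (a) of Problem 9.14 follows a fixed recipe: list the divisors of the constant term and test each by synthetic division. A minimal sketch of that recipe is given here, assuming a monic integer polynomial stored with the highest-degree coefficient first; the function names are illustrative.

```python
def synthetic_division(coeffs, r):
    """Divide a polynomial (coefficients highest degree first) by (t - r).

    Returns (quotient coefficients, remainder)."""
    row = [coeffs[0]]
    for c in coeffs[1:]:
        row.append(c + r * row[-1])   # bring down, multiply by r, add
    return row[:-1], row[-1]

def integer_root_candidates(constant):
    """All positive and negative divisors of a nonzero constant term."""
    divisors = [d for d in range(1, abs(constant) + 1) if constant % d == 0]
    return [s * d for d in divisors for s in (1, -1)]

delta = [1, -11, 39, -45]                 # t^3 - 11t^2 + 39t - 45
print(synthetic_division(delta, 3))       # ([1, -8, 15], 0)
roots = [r for r in integer_root_candidates(delta[-1])
         if synthetic_division(delta, r)[1] == 0]
print(roots)                              # [3, 5]
```

A zero remainder means $t - r$ divides the polynomial, and the quotient row gives the remaining factor, here $t^2 - 8t + 15$.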


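9.15. Repeat Problem 9.14 for the matrix $B = \begin{bmatrix} 3 & -1 & 1 \\ 7 & -5 & 1 \\ 6 & -6 & 2 \end{bmatrix}$.

(a) First find the characteristic polynomial $\Delta(t)$ of $B$. We have

\[
\operatorname{tr}(B) = 0, \quad |B| = -16, \quad B_{11} = -4, \quad B_{22} = 0, \quad B_{33} = -8, \quad \text{so} \quad \sum_i B_{ii} = -12
\]

Therefore, $\Delta(t) = t^3 - 12t + 16 = (t - 2)^2(t + 4)$. Thus, $\lambda_1 = 2$ and $\lambda_2 = -4$ are the eigenvalues of $B$.

(b) Find a basis for the eigenspace of each eigenvalue of $B$.

(i) Subtract $\lambda_1 = 2$ down the diagonal of $B$ to obtain

\[
M = \begin{bmatrix} 1 & -1 & 1 \\ 7 & -7 & 1 \\ 6 & -6 & 0 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} x - y + z &= 0 \\ 7x - 7y + z &= 0 \\ 6x - 6y &= 0 \end{aligned} \qquad \text{or} \qquad \begin{aligned} x - y + z &= 0 \\ z &= 0 \end{aligned}
\]

The system has only one independent solution; for example, $x = 1$, $y = 1$, $z = 0$. Thus, $u = (1, 1, 0)$ forms a basis for the eigenspace of $\lambda_1 = 2$.

(ii) Subtract $\lambda_2 = -4$ (or add 4) down the diagonal of $B$ to obtain

\[
M = \begin{bmatrix} 7 & -1 & 1 \\ 7 & -1 & 1 \\ 6 & -6 & 6 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} 7x - y + z &= 0 \\ 7x - y + z &= 0 \\ 6x - 6y + 6z &= 0 \end{aligned} \qquad \text{or} \qquad \begin{aligned} x - y + z &= 0 \\ 6y - 6z &= 0 \end{aligned}
\]

The system has only one independent solution; for example, $x = 0$, $y = 1$, $z = 1$. Thus, $v = (0, 1, 1)$ forms a basis for the eigenspace of $\lambda_2 = -4$.

Thus $S = \{u, v\}$ is a maximal set of linearly independent eigenvectors of $B$.

(c) Because $B$ has at most two linearly independent eigenvectors, $B$ is not similar to a diagonal matrix; that is, $B$ is not diagonalizable.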


9.16. Find the algebraic and geometric multiplicities of the eigenvalue $\lambda_1 = 2$ of the matrix $B$ in Problem 9.15.

The algebraic multiplicity of $\lambda_1 = 2$ is 2, because $t - 2$ appears with exponent 2 in $\Delta(t)$. However, the geometric multiplicity of $\lambda_1 = 2$ is 1, because $\dim E_{\lambda_1} = 1$ (where $E_{\lambda_1}$ is the eigenspace of $\lambda_1$).
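Since the eigenspace of $\lambda$ is the solution space of $(A - \lambda I)X = 0$, its dimension is $n - \operatorname{rank}(A - \lambda I)$. The sketch below computes that number by Gaussian elimination in exact arithmetic; the function names `rank` and `geometric_multiplicity` are illustrative, and the matrices are those of Problems 9.14 and 9.15.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by forward elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                       # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def geometric_multiplicity(A, lam):
    """Dimension of the eigenspace of lam: n - rank(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)

B = [[3, -1, 1], [7, -5, 1], [6, -6, 2]]     # Problem 9.15
A = [[4, 1, -1], [2, 5, -2], [1, 1, 2]]      # Problem 9.14
print(geometric_multiplicity(B, 2), geometric_multiplicity(A, 3))   # 1 2
```

Both matrices have a double eigenvalue, but only $A$ has a two-dimensional eigenspace for it, which is why $A$ is diagonalizable and $B$ is not.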

9.17. Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be defined by $T(x, y, z) = (2x + y - 2z,\ 2x + 3y - 4z,\ x + y - z)$. Find all eigenvalues of $T$, and find a basis of each eigenspace. Is $T$ diagonalizable? If so, find a basis $S$ of $\mathbf{R}^3$ that diagonalizes $T$, and find its diagonal representation $D$.

First find the matrix $A$ that represents $T$ relative to the usual basis of $\mathbf{R}^3$ by writing down the coefficients of $x, y, z$ as rows, and then find the characteristic polynomial of $A$ (and $T$). We have

\[
A = [T] = \begin{bmatrix} 2 & 1 & -2 \\ 2 & 3 & -4 \\ 1 & 1 & -1 \end{bmatrix} \qquad \text{and} \qquad \begin{aligned} &\operatorname{tr}(A) = 4, \quad |A| = 2 \\ &A_{11} = 1, \quad A_{22} = 0, \quad A_{33} = 4 \\ &\textstyle\sum_i A_{ii} = 5 \end{aligned}
\]

Therefore, $\Delta(t) = t^3 - 4t^2 + 5t - 2 = (t - 1)^2(t - 2)$, and so $\lambda = 1$ and $\lambda = 2$ are the eigenvalues of $A$ (and $T$). We next find linearly independent eigenvectors for each eigenvalue of $A$.

(i) Subtract $\lambda = 1$ down the diagonal of $A$ to obtain the matrix

\[
M = \begin{bmatrix} 1 & 1 & -2 \\ 2 & 2 & -4 \\ 1 & 1 & -2 \end{bmatrix}, \qquad \text{corresponding to} \qquad x + y - 2z = 0
\]

Here $y$ and $z$ are free variables, and so there are two linearly independent eigenvectors belonging to $\lambda = 1$. For example, $u = (1, -1, 0)$ and $v = (2, 0, 1)$ are two such eigenvectors.

(ii) Subtract $\lambda = 2$ down the diagonal of $A$ to obtain

\[
M = \begin{bmatrix} 0 & 1 & -2 \\ 2 & 1 & -4 \\ 1 & 1 & -3 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} y - 2z &= 0 \\ 2x + y - 4z &= 0 \\ x + y - 3z &= 0 \end{aligned} \qquad \text{or} \qquad \begin{aligned} x + y - 3z &= 0 \\ y - 2z &= 0 \end{aligned}
\]

Only $z$ is a free variable. Here $w = (1, 2, 1)$ is a solution.

Thus, $T$ is diagonalizable, because it has three independent eigenvectors. Specifically, choosing

\[
S = \{u, v, w\} = \{(1, -1, 0),\ (2, 0, 1),\ (1, 2, 1)\}
\]

as a basis, $T$ is represented by the diagonal matrix $D = \operatorname{diag}(1, 1, 2)$.

9.18. Prove the following for a linear operator (matrix) $T$:

(a) The scalar 0 is an eigenvalue of $T$ if and only if $T$ is singular.

(b) If $\lambda$ is an eigenvalue of $T$, where $T$ is invertible, then $\lambda^{-1}$ is an eigenvalue of $T^{-1}$.

(a) We have that 0 is an eigenvalue of $T$ if and only if there is a vector $v \ne 0$ such that $T(v) = 0v$; that is, if and only if $T$ is singular.

(b) Because $T$ is invertible, it is nonsingular; hence, by (a), $\lambda \ne 0$. By definition of an eigenvalue, there exists $v \ne 0$ such that $T(v) = \lambda v$. Applying $T^{-1}$ to both sides, we obtain

\[
v = T^{-1}(\lambda v) = \lambda T^{-1}(v), \qquad \text{and so} \qquad T^{-1}(v) = \lambda^{-1}v
\]

Therefore, $\lambda^{-1}$ is an eigenvalue of $T^{-1}$.

9.19. Let $\lambda$ be an eigenvalue of a linear operator $T: V \to V$, and let $E_\lambda$ consist of all the eigenvectors belonging to $\lambda$ (called the eigenspace of $\lambda$). Prove that $E_\lambda$ is a subspace of $V$. That is, prove

(a) If $u \in E_\lambda$, then $ku \in E_\lambda$ for any scalar $k$. (b) If $u, v \in E_\lambda$, then $u + v \in E_\lambda$.

(a) Because $u \in E_\lambda$, we have $T(u) = \lambda u$. Then $T(ku) = kT(u) = k(\lambda u) = \lambda(ku)$, and so $ku \in E_\lambda$.

(We view the zero vector $0 \in V$ as an "eigenvector" of $\lambda$ in order for $E_\lambda$ to be a subspace of $V$.)

(b) As $u, v \in E_\lambda$, we have $T(u) = \lambda u$ and $T(v) = \lambda v$. Then

\[
T(u + v) = T(u) + T(v) = \lambda u + \lambda v = \lambda(u + v), \qquad \text{and so} \qquad u + v \in E_\lambda
\]

9.20. Prove Theorem 9.6: The following are equivalent: (i) The scalar $\lambda$ is an eigenvalue of $A$.

(ii) The matrix $\lambda I - A$ is singular.

(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of $A$.

The scalar $\lambda$ is an eigenvalue of $A$ if and only if there exists a nonzero vector $v$ such that

\[
Av = \lambda v \qquad \text{or} \qquad (\lambda I)v - Av = 0 \qquad \text{or} \qquad (\lambda I - A)v = 0
\]

or $\lambda I - A$ is singular. In such a case, $\lambda$ is a root of $\Delta(t) = |tI - A|$. Also, $v$ is in the eigenspace $E_\lambda$ of $\lambda$ if and only if the above relations hold. Hence, $v$ is a solution of $(\lambda I - A)X = 0$.


9.21. Prove Theorem 9.8': Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of $T$ belonging to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.

Suppose the theorem is not true. Let $v_1, v_2, \ldots, v_s$ be a minimal set of vectors for which the theorem is not true. We have $s > 1$, because $v_1 \ne 0$. Also, by the minimality condition, $v_2, \ldots, v_s$ are linearly independent. Thus, $v_1$ is a linear combination of $v_2, \ldots, v_s$, say,

\[
v_1 = a_2 v_2 + a_3 v_3 + \cdots + a_s v_s \tag{1}
\]

(where some $a_k \ne 0$). Applying $T$ to (1) and using the linearity of $T$ yields

\[
T(v_1) = T(a_2 v_2 + a_3 v_3 + \cdots + a_s v_s) = a_2 T(v_2) + a_3 T(v_3) + \cdots + a_s T(v_s) \tag{2}
\]

Because $v_j$ is an eigenvector of $T$ belonging to $\lambda_j$, we have $T(v_j) = \lambda_j v_j$. Substituting in (2) yields

\[
\lambda_1 v_1 = a_2 \lambda_2 v_2 + a_3 \lambda_3 v_3 + \cdots + a_s \lambda_s v_s \tag{3}
\]

Multiplying (1) by $\lambda_1$ yields

\[
\lambda_1 v_1 = a_2 \lambda_1 v_2 + a_3 \lambda_1 v_3 + \cdots + a_s \lambda_1 v_s \tag{4}
\]

Setting the right-hand sides of (3) and (4) equal to each other, or subtracting (3) from (4), yields

\[
a_2(\lambda_1 - \lambda_2)v_2 + a_3(\lambda_1 - \lambda_3)v_3 + \cdots + a_s(\lambda_1 - \lambda_s)v_s = 0 \tag{5}
\]

Because $v_2, v_3, \ldots, v_s$ are linearly independent, the coefficients in (5) must all be zero. That is,

\[
a_2(\lambda_1 - \lambda_2) = 0, \quad a_3(\lambda_1 - \lambda_3) = 0, \quad \ldots, \quad a_s(\lambda_1 - \lambda_s) = 0
\]

However, the $\lambda_i$ are distinct. Hence $\lambda_1 - \lambda_j \ne 0$ for $j > 1$. Hence, $a_2 = 0, a_3 = 0, \ldots, a_s = 0$. This contradicts the fact that some $a_k \ne 0$. The theorem is proved.


9.22. Prove Theorem 9.9. Suppose $\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$ is the characteristic polynomial of an $n$-square matrix $A$, and suppose the $n$ roots $a_i$ are distinct. Then $A$ is similar to the diagonal matrix $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.

Let $v_1, v_2, \ldots, v_n$ be (nonzero) eigenvectors corresponding to the eigenvalues $a_i$. Then the $n$ eigenvectors $v_i$ are linearly independent (Theorem 9.8), and hence form a basis of $K^n$. Accordingly, $A$ is diagonalizable (i.e., $A$ is similar to a diagonal matrix $D$), and the diagonal elements of $D$ are the eigenvalues $a_i$.

9.23. Prove Theorem 9.10': The geometric multiplicity of an eigenvalue $\lambda$ of $T$ does not exceed its algebraic multiplicity.

Suppose the geometric multiplicity of $\lambda$ is $r$. Then its eigenspace $E_\lambda$ contains $r$ linearly independent eigenvectors $v_1, \ldots, v_r$. Extend the set $\{v_i\}$ to a basis of $V$, say, $\{v_1, \ldots, v_r, w_1, \ldots, w_s\}$. We have

\[
\begin{aligned}
T(v_1) &= \lambda v_1, \quad T(v_2) = \lambda v_2, \quad \ldots, \quad T(v_r) = \lambda v_r, \\
T(w_1) &= a_{11}v_1 + \cdots + a_{1r}v_r + b_{11}w_1 + \cdots + b_{1s}w_s \\
T(w_2) &= a_{21}v_1 + \cdots + a_{2r}v_r + b_{21}w_1 + \cdots + b_{2s}w_s \\
&\ \ \vdots \\
T(w_s) &= a_{s1}v_1 + \cdots + a_{sr}v_r + b_{s1}w_1 + \cdots + b_{ss}w_s
\end{aligned}
\]

Then $M = \begin{bmatrix} \lambda I_r & A \\ 0 & B \end{bmatrix}$ is the matrix of $T$ in the above basis, where $A = [a_{ij}]^T$ and $B = [b_{ij}]^T$.

Because $M$ is block triangular, the characteristic polynomial $(t - \lambda)^r$ of the block $\lambda I_r$ must divide the characteristic polynomial of $M$ and hence of $T$. Thus, the algebraic multiplicity of $\lambda$ for $T$ is at least $r$, as required.


Diagonalizing Real Symmetric Matrices and Quadratic Forms 

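9.24. Let $A = \begin{bmatrix} 7 & 3 \\ 3 & -1 \end{bmatrix}$. Find an orthogonal matrix $P$ such that $D = P^{-1}AP$ is diagonal.

First find the characteristic polynomial $\Delta(t)$ of $A$. We have

\[
\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 - 6t - 16 = (t - 8)(t + 2)
\]

Thus, the eigenvalues of $A$ are $\lambda = 8$ and $\lambda = -2$. We next find corresponding eigenvectors.

Subtract $\lambda = 8$ down the diagonal of $A$ to obtain the matrix

\[
M = \begin{bmatrix} -1 & 3 \\ 3 & -9 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} -x + 3y &= 0 \\ 3x - 9y &= 0 \end{aligned} \qquad \text{or} \qquad x - 3y = 0
\]

A nonzero solution is $u_1 = (3, 1)$.

Subtract $\lambda = -2$ (or add 2) down the diagonal of $A$ to obtain the matrix

\[
M = \begin{bmatrix} 9 & 3 \\ 3 & 1 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} 9x + 3y &= 0 \\ 3x + y &= 0 \end{aligned} \qquad \text{or} \qquad 3x + y = 0
\]

A nonzero solution is $u_2 = (1, -3)$.

As expected, because $A$ is symmetric, the eigenvectors $u_1$ and $u_2$ are orthogonal. Normalize $u_1$ and $u_2$ to obtain, respectively, the unit vectors

\[
\hat{u}_1 = (3/\sqrt{10},\ 1/\sqrt{10}) \qquad \text{and} \qquad \hat{u}_2 = (1/\sqrt{10},\ -3/\sqrt{10}).
\]

Finally, let $P$ be the matrix whose columns are the unit vectors $\hat{u}_1$ and $\hat{u}_2$, respectively. Then

\[
P = \begin{bmatrix} 3/\sqrt{10} & 1/\sqrt{10} \\ 1/\sqrt{10} & -3/\sqrt{10} \end{bmatrix} \qquad \text{and} \qquad D = P^{-1}AP = \begin{bmatrix} 8 & 0 \\ 0 & -2 \end{bmatrix}
\]

As expected, the diagonal entries in $D$ are the eigenvalues of $A$.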

9.25. Let $B = \begin{bmatrix} 11 & -8 & 4 \\ -8 & -1 & -2 \\ 4 & -2 & -4 \end{bmatrix}$.

(a) Find all eigenvalues of $B$.

(b) Find a maximal set $S$ of nonzero orthogonal eigenvectors of $B$.

(c) Find an orthogonal matrix $P$ such that $D = P^{-1}BP$ is diagonal.

(a) First find the characteristic polynomial of $B$. We have

\[
\operatorname{tr}(B) = 6, \quad |B| = 400, \quad B_{11} = 0, \quad B_{22} = -60, \quad B_{33} = -75, \quad \text{so} \quad \sum_i B_{ii} = -135
\]

Hence, $\Delta(t) = t^3 - 6t^2 - 135t - 400$. If $\Delta(t)$ has an integer root it must divide 400. Testing $t = -5$, by synthetic division, yields

\[
\begin{array}{r|rrrr}
-5 & 1 & -6 & -135 & -400 \\
   &   & -5 & 55 & 400 \\ \hline
   & 1 & -11 & -80 & 0
\end{array}
\]

Thus, $t + 5$ is a factor of $\Delta(t)$, and $t^2 - 11t - 80$ is a factor. Thus,

\[
\Delta(t) = (t + 5)(t^2 - 11t - 80) = (t + 5)^2(t - 16)
\]

The eigenvalues of $B$ are $\lambda = -5$ (multiplicity 2), and $\lambda = 16$ (multiplicity 1).

(b) Find an orthogonal basis for each eigenspace. Subtract $\lambda = -5$ (or, add 5) down the diagonal of $B$ to obtain the homogeneous system

\[
16x - 8y + 4z = 0, \qquad -8x + 4y - 2z = 0, \qquad 4x - 2y + z = 0
\]

That is, $4x - 2y + z = 0$. The system has two independent solutions. One solution is $v_1 = (0, 1, 2)$. We seek a second solution $v_2 = (a, b, c)$, which is orthogonal to $v_1$, such that

\[
4a - 2b + c = 0, \qquad \text{and also} \qquad b + 2c = 0
\]

One such solution is $v_2 = (-5, -8, 4)$.

Subtract $\lambda = 16$ down the diagonal of $B$ to obtain the homogeneous system

\[
-5x - 8y + 4z = 0, \qquad -8x - 17y - 2z = 0, \qquad 4x - 2y - 20z = 0
\]

This system yields a nonzero solution $v_3 = (4, -2, 1)$. (As expected from Theorem 9.13, the eigenvector $v_3$ is orthogonal to $v_1$ and $v_2$.)

Then $v_1, v_2, v_3$ form a maximal set of nonzero orthogonal eigenvectors of $B$.

(c) Normalize $v_1, v_2, v_3$ to obtain the orthonormal basis:

\[
\hat{v}_1 = v_1/\sqrt{5}, \qquad \hat{v}_2 = v_2/\sqrt{105}, \qquad \hat{v}_3 = v_3/\sqrt{21}
\]

Then $P$ is the matrix whose columns are $\hat{v}_1, \hat{v}_2, \hat{v}_3$. Thus,

\[
P = \begin{bmatrix} 0 & -5/\sqrt{105} & 4/\sqrt{21} \\ 1/\sqrt{5} & -8/\sqrt{105} & -2/\sqrt{21} \\ 2/\sqrt{5} & 4/\sqrt{105} & 1/\sqrt{21} \end{bmatrix} \qquad \text{and} \qquad D = P^{-1}BP = \begin{bmatrix} -5 & & \\ & -5 & \\ & & 16 \end{bmatrix}
\]
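In part (b) of Problem 9.25 the second eigenvector is found by imposing an extra orthogonality equation. An equivalent route, sketched here as an alternative and not as the text's method, is to take any second solution of $4x - 2y + z = 0$ and remove its component along $v_1$ (one Gram-Schmidt step). The starting vector $(1, 2, 0)$ and the function names are choices made for illustration.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthogonalize(w, basis):
    """Subtract from w its projections onto the mutually orthogonal vectors in basis."""
    w = [Fraction(x) for x in w]
    for b in basis:
        coeff = dot(w, b) / Fraction(dot(b, b))
        w = [x - coeff * y for x, y in zip(w, b)]
    return w

v1 = [0, 1, 2]                 # first solution of 4x - 2y + z = 0
w = [1, 2, 0]                  # another solution, not orthogonal to v1
v2 = orthogonalize(w, [v1])    # (1, 8/5, -4/5)

print(v2, dot(v2, v1))
print([-5 * x for x in v2])    # rescaled to (-5, -8, 4)
```

The result is a scalar multiple of $v_2 = (-5, -8, 4)$, and it stays in the eigenspace because the eigenspace is a subspace.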


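9.26. Let $q(x, y) = x^2 + 6xy - 7y^2$. Find an orthogonal substitution that diagonalizes $q$.

Find the symmetric matrix $A$ that represents $q$ and its characteristic polynomial $\Delta(t)$. We have

\[
A = \begin{bmatrix} 1 & 3 \\ 3 & -7 \end{bmatrix} \qquad \text{and} \qquad \Delta(t) = t^2 + 6t - 16 = (t - 2)(t + 8)
\]

The eigenvalues of $A$ are $\lambda = 2$ and $\lambda = -8$. Thus, using $s$ and $t$ as new variables, a diagonal form of $q$ is

\[
q(s, t) = 2s^2 - 8t^2
\]

The corresponding orthogonal substitution is obtained by finding an orthogonal set of eigenvectors of $A$.

(i) Subtract $\lambda = 2$ down the diagonal of $A$ to obtain the matrix

\[
M = \begin{bmatrix} -1 & 3 \\ 3 & -9 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} -x + 3y &= 0 \\ 3x - 9y &= 0 \end{aligned} \qquad \text{or} \qquad -x + 3y = 0
\]

A nonzero solution is $u_1 = (3, 1)$.

(ii) Subtract $\lambda = -8$ (or add 8) down the diagonal of $A$ to obtain the matrix

\[
M = \begin{bmatrix} 9 & 3 \\ 3 & 1 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} 9x + 3y &= 0 \\ 3x + y &= 0 \end{aligned} \qquad \text{or} \qquad 3x + y = 0
\]

A nonzero solution is $u_2 = (-1, 3)$.

As expected, because $A$ is symmetric, the eigenvectors $u_1$ and $u_2$ are orthogonal.

Now normalize $u_1$ and $u_2$ to obtain, respectively, the unit vectors

\[
\hat{u}_1 = (3/\sqrt{10},\ 1/\sqrt{10}) \qquad \text{and} \qquad \hat{u}_2 = (-1/\sqrt{10},\ 3/\sqrt{10}).
\]

Finally, let $P$ be the matrix whose columns are the unit vectors $\hat{u}_1$ and $\hat{u}_2$, respectively, and then $[x, y]^T = P[s, t]^T$ is the required orthogonal change of coordinates. That is,

\[
P = \begin{bmatrix} 3/\sqrt{10} & -1/\sqrt{10} \\ 1/\sqrt{10} & 3/\sqrt{10} \end{bmatrix} \qquad \text{and} \qquad x = \frac{3s - t}{\sqrt{10}}, \quad y = \frac{s + 3t}{\sqrt{10}}
\]

One can also express $s$ and $t$ in terms of $x$ and $y$ by using $P^{-1} = P^T$. That is,

\[
s = \frac{3x + y}{\sqrt{10}}, \qquad t = \frac{-x + 3y}{\sqrt{10}}
\]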
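As a check on Problem 9.26, one may substitute $x = (3s - t)/\sqrt{10}$ and $y = (s + 3t)/\sqrt{10}$ directly into $q$ and confirm that the cross term disappears. The expansion below is a worked verification added for illustration; it uses only the substitution stated in the problem.

```latex
\begin{aligned}
q &= x^2 + 6xy - 7y^2 \\
  &= \tfrac{1}{10}\left[(3s - t)^2 + 6(3s - t)(s + 3t) - 7(s + 3t)^2\right] \\
  &= \tfrac{1}{10}\left[(9s^2 - 6st + t^2) + (18s^2 + 48st - 18t^2) - (7s^2 + 42st + 63t^2)\right] \\
  &= \tfrac{1}{10}\left[20s^2 - 80t^2\right] = 2s^2 - 8t^2
\end{aligned}
```

The coefficients 2 and $-8$ are exactly the eigenvalues of $A$, and the $st$ terms cancel because $-6 + 48 - 42 = 0$.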
Minimal Polynomial



9.27. Let $A = \begin{bmatrix} 4 & -2 & 2 \\ 6 & -3 & 4 \\ 3 & -2 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -2 & 2 \\ 4 & -4 & 6 \\ 2 & -3 & 5 \end{bmatrix}$. The characteristic polynomial of both matrices is $\Delta(t) = (t - 2)(t - 1)^2$. Find the minimal polynomial $m(t)$ of each matrix.

The minimal polynomial $m(t)$ must divide $\Delta(t)$. Also, each factor of $\Delta(t)$ (i.e., $t - 2$ and $t - 1$) must also be a factor of $m(t)$. Thus, $m(t)$ must be exactly one of the following:

\[
f(t) = (t - 2)(t - 1) \qquad \text{or} \qquad g(t) = (t - 2)(t - 1)^2
\]

(a) By the Cayley-Hamilton theorem, $g(A) = \Delta(A) = 0$, so we need only test $f(t)$. We have

\[
f(A) = (A - 2I)(A - I) = \begin{bmatrix} 2 & -2 & 2 \\ 6 & -5 & 4 \\ 3 & -2 & 1 \end{bmatrix}\begin{bmatrix} 3 & -2 & 2 \\ 6 & -4 & 4 \\ 3 & -2 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

Thus, $m(t) = f(t) = (t - 2)(t - 1) = t^2 - 3t + 2$ is the minimal polynomial of $A$.

(b) Again $g(B) = \Delta(B) = 0$, so we need only test $f(t)$. We get

\[
f(B) = (B - 2I)(B - I) = \begin{bmatrix} 1 & -2 & 2 \\ 4 & -6 & 6 \\ 2 & -3 & 3 \end{bmatrix}\begin{bmatrix} 2 & -2 & 2 \\ 4 & -5 & 6 \\ 2 & -3 & 4 \end{bmatrix} = \begin{bmatrix} -2 & 2 & -2 \\ -4 & 4 & -4 \\ -2 & 2 & -2 \end{bmatrix} \ne 0
\]

Thus, $m(t) \ne f(t)$. Accordingly, $m(t) = g(t) = (t - 2)(t - 1)^2$ is the minimal polynomial of $B$. [We emphasize that we do not need to compute $g(B)$; we know $g(B) = 0$ from the Cayley-Hamilton theorem.]
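The test in Problem 9.27 amounts to multiplying out one candidate polynomial in factored form. A minimal sketch of that check follows; the helper names `shift` and `f_of` are illustrative, and the matrices are those given in the problem.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def shift(X, c):
    """Return X - c*I."""
    n = len(X)
    return [[X[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]

def f_of(X):
    """Evaluate the candidate f(t) = (t - 2)(t - 1) at the matrix X."""
    return mat_mul(shift(X, 2), shift(X, 1))

A = [[4, -2, 2], [6, -3, 4], [3, -2, 3]]
B = [[3, -2, 2], [4, -4, 6], [2, -3, 5]]

print(f_of(A))   # zero matrix, so m(t) = (t - 2)(t - 1) for A
print(f_of(B))   # nonzero, so m(t) = (t - 2)(t - 1)^2 for B
```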


9.28. Find the minimal polynomial $m(t)$ of each of the following matrices:

\[
\text{(a)}\ A = \begin{bmatrix} 5 & 1 \\ 3 & 7 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} 4 & -1 \\ 1 & 2 \end{bmatrix}
\]

(a) The characteristic polynomial of $A$ is $\Delta(t) = t^2 - 12t + 32 = (t - 4)(t - 8)$. Because $\Delta(t)$ has distinct factors, the minimal polynomial $m(t) = \Delta(t) = t^2 - 12t + 32$.

(b) Because $B$ is triangular, its eigenvalues are the diagonal elements 1, 2, 3; and so its characteristic polynomial is $\Delta(t) = (t - 1)(t - 2)(t - 3)$. Because $\Delta(t)$ has distinct factors, $m(t) = \Delta(t)$.

(c) The characteristic polynomial of $C$ is $\Delta(t) = t^2 - 6t + 9 = (t - 3)^2$. Hence the minimal polynomial of $C$ is $f(t) = t - 3$ or $g(t) = (t - 3)^2$. However, $f(C) \ne 0$; that is, $C - 3I \ne 0$. Hence,

\[
m(t) = g(t) = \Delta(t) = (t - 3)^2.
\]


9.29. Suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$, and suppose $F$ and $G$ are linear operators on $V$ such that $[F]$ has 0's on and below the diagonal, and $[G]$ has $a \ne 0$ on the superdiagonal and 0's elsewhere. That is,

\[
[F] = \begin{bmatrix}
0 & a_{21} & a_{31} & \cdots & a_{n1} \\
0 & 0 & a_{32} & \cdots & a_{n2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & a_{n,n-1} \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}, \qquad
[G] = \begin{bmatrix}
0 & a & 0 & \cdots & 0 \\
0 & 0 & a & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & a \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}
\]

Show that (a) $F^n = 0$, (b) $G^{n-1} \ne 0$, but $G^n = 0$. (These conditions also hold for $[F]$ and $[G]$.)

(a) We have $F(u_1) = 0$ and, for $r > 1$, $F(u_r)$ is a linear combination of vectors preceding $u_r$ in $S$. That is,

\[
F(u_r) = a_{r1}u_1 + a_{r2}u_2 + \cdots + a_{r,r-1}u_{r-1}
\]

Hence, $F^2(u_r) = F(F(u_r))$ is a linear combination of vectors preceding $u_{r-1}$, and so on. Hence, $F^r(u_r) = 0$ for each $r$. Thus, for each $r$, $F^n(u_r) = F^{n-r}(0) = 0$, and so $F^n = 0$, as claimed.

(b) We have $G(u_1) = 0$ and, for each $k > 1$, $G(u_k) = au_{k-1}$. Hence, $G^r(u_k) = a^r u_{k-r}$ for $r < k$. Because $a \ne 0$, $a^{n-1} \ne 0$. Therefore, $G^{n-1}(u_n) = a^{n-1}u_1 \ne 0$, and so $G^{n-1} \ne 0$. On the other hand, by (a), $G^n = 0$.


9.30. Let $B$ be the matrix in Example 9.12(a) that has $\lambda$'s on the diagonal, $a$'s on the superdiagonal, where $a \ne 0$, and 0's elsewhere. Show that $f(t) = (t - \lambda)^n$ is both the characteristic polynomial $\Delta(t)$ and the minimal polynomial $m(t)$ of $B$.

Because $B$ is triangular with $\lambda$'s on the diagonal, $\Delta(t) = f(t) = (t - \lambda)^n$ is its characteristic polynomial. Thus, $m(t)$ is a power of $t - \lambda$. By Problem 9.29, $(B - \lambda I)^{n-1} \ne 0$. Hence, $m(t) = \Delta(t) = (t - \lambda)^n$.


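9.31. Find the characteristic polynomial $\Delta(t)$ and minimal polynomial $m(t)$ of each matrix:

\[
\text{(a)}\ M = \begin{bmatrix}
4 & 1 & 0 & 0 & 0 \\
0 & 4 & 1 & 0 & 0 \\
0 & 0 & 4 & 0 & 0 \\
0 & 0 & 0 & 4 & 1 \\
0 & 0 & 0 & 0 & 4
\end{bmatrix}, \qquad
\text{(b)}\ M' = \begin{bmatrix}
2 & 7 & 0 & 0 \\
0 & 2 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & -2 & 4
\end{bmatrix}
\]

(a) $M$ is block diagonal with diagonal blocks

\[
A = \begin{bmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} 4 & 1 \\ 0 & 4 \end{bmatrix}
\]

The characteristic and minimal polynomial of $A$ is $f(t) = (t - 4)^3$ and the characteristic and minimal polynomial of $B$ is $g(t) = (t - 4)^2$. Then

\[
\Delta(t) = f(t)g(t) = (t - 4)^5 \qquad \text{but} \qquad m(t) = \operatorname{LCM}[f(t), g(t)] = (t - 4)^3
\]

(where LCM means least common multiple). We emphasize that the exponent in $m(t)$ is the size of the largest block.

(b) Here $M'$ is block diagonal with diagonal blocks $A' = \begin{bmatrix} 2 & 7 \\ 0 & 2 \end{bmatrix}$ and $B' = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}$. The characteristic and minimal polynomial of $A'$ is $f(t) = (t - 2)^2$. The characteristic polynomial of $B'$ is $g(t) = t^2 - 5t + 6 = (t - 2)(t - 3)$, which has distinct factors. Hence, $g(t)$ is also the minimal polynomial of $B'$. Accordingly,

\[
\Delta(t) = f(t)g(t) = (t - 2)^3(t - 3) \qquad \text{but} \qquad m(t) = \operatorname{LCM}[f(t), g(t)] = (t - 2)^2(t - 3)
\]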

9.32. Find a matrix $A$ whose minimal polynomial is $f(t) = t^3 - 8t^2 + 5t + 7$.

Simply let $A = \begin{bmatrix} 0 & 0 & -7 \\ 1 & 0 & -5 \\ 0 & 1 & 8 \end{bmatrix}$, the companion matrix of $f(t)$ [defined in Example 9.12(b)].
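The construction in Problem 9.32 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `companion`, `matmul`, and `eval_monic` are hypothetical, and the polynomial is stored as its non-leading coefficients, lowest degree first. It builds the companion matrix and confirms that $f(A) = 0$; that no polynomial of lower degree annihilates $A$ is the theoretical fact quoted from Example 9.12(b), which the code does not test.

```python
def companion(coeffs):
    """Companion matrix of the monic polynomial
    t^n + coeffs[n-1]*t^(n-1) + ... + coeffs[1]*t + coeffs[0]."""
    n = len(coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1            # 1's on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -coeffs[i]   # last column: negated coefficients
    return C


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def eval_monic(coeffs, A):
    """Evaluate the monic polynomial at the square matrix A by Horner's rule."""
    n = len(A)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    result = I                     # leading coefficient 1
    for c in reversed(coeffs):
        prod = matmul(result, A)
        result = [[prod[i][j] + c * I[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# f(t) = t^3 - 8t^2 + 5t + 7
f = [7, 5, -8]
A = companion(f)
fA = eval_monic(f, A)
print(A)    # the matrix given in Problem 9.32
print(fA)   # the zero matrix
```

Integer arithmetic keeps the check exact, so no rounding tolerance is needed.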


9.33. Prove Theorem 9.15: The minimal polynomial $m(t)$ of a matrix (linear operator) $A$ divides every
polynomial that has $A$ as a zero. In particular (by the Cayley-Hamilton theorem), $m(t)$ divides the
characteristic polynomial $\Delta(t)$ of $A$.

Suppose $f(t)$ is a polynomial for which $f(A) = 0$. By the division algorithm, there exist polynomials $q(t)$
and $r(t)$ for which $f(t) = m(t)q(t) + r(t)$ and $r(t) = 0$ or $\deg r(t) < \deg m(t)$. Substituting $t = A$ in this
equation, and using that $f(A) = 0$ and $m(A) = 0$, we obtain $r(A) = 0$. If $r(t) \neq 0$, then $r(t)$ is a polynomial of
degree less than that of $m(t)$ that has $A$ as a zero. This contradicts the definition of the minimal polynomial. Thus,
$r(t) = 0$, and so $f(t) = m(t)q(t)$; that is, $m(t)$ divides $f(t)$.
9.34. Let $m(t)$ be the minimal polynomial of an $n$-square matrix $A$. Prove that the characteristic
polynomial $\Delta(t)$ of $A$ divides $[m(t)]^n$.

Suppose $m(t) = t^r + c_1 t^{r-1} + \cdots + c_{r-1} t + c_r$. Define matrices $B_j$ as follows:

\[
\begin{array}{lll}
B_0 = I & \text{so} & I = B_0 \\
B_1 = A + c_1 I & \text{so} & c_1 I = B_1 - A = B_1 - AB_0 \\
B_2 = A^2 + c_1 A + c_2 I & \text{so} & c_2 I = B_2 - A(A + c_1 I) = B_2 - AB_1 \\
\cdots & & \cdots \\
B_{r-1} = A^{r-1} + c_1 A^{r-2} + \cdots + c_{r-1} I & \text{so} & c_{r-1} I = B_{r-1} - AB_{r-2}
\end{array}
\]

Then

\[
-AB_{r-1} = c_r I - (A^r + c_1 A^{r-1} + \cdots + c_{r-1} A + c_r I) = c_r I - m(A) = c_r I
\]

Set

\[
B(t) = t^{r-1} B_0 + t^{r-2} B_1 + \cdots + t B_{r-2} + B_{r-1}
\]

Then

\[
\begin{aligned}
(tI - A)B(t) &= (t^r B_0 + t^{r-1} B_1 + \cdots + t B_{r-1}) - (t^{r-1} AB_0 + t^{r-2} AB_1 + \cdots + AB_{r-1}) \\
&= t^r B_0 + t^{r-1}(B_1 - AB_0) + t^{r-2}(B_2 - AB_1) + \cdots + t(B_{r-1} - AB_{r-2}) - AB_{r-1} \\
&= t^r I + c_1 t^{r-1} I + c_2 t^{r-2} I + \cdots + c_{r-1} t I + c_r I = m(t) I
\end{aligned}
\]

Taking the determinant of both sides gives $|tI - A|\,|B(t)| = |m(t)I| = [m(t)]^n$. Because $|B(t)|$ is a polynomial,
$|tI - A|$ divides $[m(t)]^n$; that is, the characteristic polynomial of $A$ divides $[m(t)]^n$.


9.35. Prove Theorem 9.16: The characteristic polynomial $\Delta(t)$ and the minimal polynomial $m(t)$ of $A$
have the same irreducible factors.

Suppose $f(t)$ is an irreducible polynomial. If $f(t)$ divides $m(t)$, then $f(t)$ also divides $\Delta(t)$ [because $m(t)$
divides $\Delta(t)$]. On the other hand, if $f(t)$ divides $\Delta(t)$, then by Problem 9.34, $f(t)$ also divides $[m(t)]^n$. But $f(t)$
is irreducible; hence, $f(t)$ also divides $m(t)$. Thus, $m(t)$ and $\Delta(t)$ have the same irreducible factors.


9.36. Prove Theorem 9.19: The minimal polynomial $m(t)$ of a block diagonal matrix $M$ with diagonal
blocks $A_i$ is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal
blocks $A_i$.

We prove the theorem for the case $r = 2$. The general theorem follows easily by induction. Suppose
$M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where $A$ and $B$ are square matrices. We need to show that the minimal polynomial $m(t)$ of $M$
is the LCM of the minimal polynomials $g(t)$ and $h(t)$ of $A$ and $B$, respectively.

Because $m(t)$ is the minimal polynomial of $M$,
$m(M) = \begin{bmatrix} m(A) & 0 \\ 0 & m(B) \end{bmatrix} = 0$, and so $m(A) = 0$ and
$m(B) = 0$. Because $g(t)$ is the minimal polynomial of $A$, $g(t)$ divides $m(t)$. Similarly, $h(t)$ divides $m(t)$.
Thus $m(t)$ is a multiple of $g(t)$ and $h(t)$.

Now let $f(t)$ be another multiple of $g(t)$ and $h(t)$. Then
$f(M) = \begin{bmatrix} f(A) & 0 \\ 0 & f(B) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0$. But
$m(t)$ is the minimal polynomial of $M$; hence, $m(t)$ divides $f(t)$. Thus, $m(t)$ is the LCM of $g(t)$ and $h(t)$.


9.37. Suppose $m(t) = t^r + a_{r-1} t^{r-1} + \cdots + a_1 t + a_0$ is the minimal polynomial of an $n$-square matrix $A$.
Prove the following:

(a) $A$ is nonsingular if and only if the constant term $a_0 \neq 0$.

(b) If $A$ is nonsingular, then $A^{-1}$ is a polynomial in $A$ of degree $r - 1 < n$.

(a) The following are equivalent: (i) $A$ is nonsingular, (ii) 0 is not a root of $m(t)$, (iii) $a_0 \neq 0$. Thus, the
statement is true.

(b) Because $A$ is nonsingular, $a_0 \neq 0$ by (a). We have

\[
m(A) = A^r + a_{r-1} A^{r-1} + \cdots + a_1 A + a_0 I = 0
\]

Thus,

\[
-\frac{1}{a_0}\,(A^{r-1} + a_{r-1} A^{r-2} + \cdots + a_1 I)\,A = I
\]

Accordingly,

\[
A^{-1} = -\frac{1}{a_0}\,(A^{r-1} + a_{r-1} A^{r-2} + \cdots + a_1 I)
\]
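The formula in Problem 9.37(b) translates directly into a computation. The sketch below is illustrative only: the function name `inverse_from_min_poly` and the sample matrix (the block $A'$ of Problem 9.31, whose minimal polynomial is $(t-2)^2 = t^2 - 4t + 4$) are choices made here, and exact rational arithmetic from Python's `fractions` module is assumed so that the check $A A^{-1} = I$ is exact.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def inverse_from_min_poly(A, coeffs):
    """Inverse of A from its monic minimal polynomial
    m(t) = t^r + a_{r-1} t^{r-1} + ... + a_1 t + a_0,
    passed as coeffs = [a_0, a_1, ..., a_{r-1}].
    Uses A^{-1} = -(1/a_0) (A^{r-1} + a_{r-1} A^{r-2} + ... + a_1 I)."""
    a0 = Fraction(coeffs[0])
    if a0 == 0:
        raise ValueError("constant term is 0, so A is singular")
    n = len(A)
    I = identity(n)
    S = I                                  # Horner's rule, leading coefficient 1
    for c in reversed(coeffs[1:]):         # a_{r-1}, ..., a_1
        prod = matmul(S, A)
        S = [[prod[i][j] + c * I[i][j] for j in range(n)] for i in range(n)]
    return [[-x / a0 for x in row] for row in S]


A = [[2, 7], [0, 2]]                       # minimal polynomial t^2 - 4t + 4
A_inv = inverse_from_min_poly(A, [4, -4])
print(A_inv)                               # entries 1/2, -7/4, 0, 1/2
print(matmul(A, A_inv) == identity(2))     # True
```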


SUPPLEMENTARY PROBLEMS 


Polynomials of Matrices 


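9.38. Let $A = \begin{bmatrix} 2 & -3 \\ 5 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}$. Find $f(A)$, $g(A)$, $f(B)$, $g(B)$, where $f(t) = 2t^2 - 5t + 6$ and
$g(t) = t^3 - 2t^2 + t + 3$.

9.39. Let $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$. Find $A^2$, $A^3$, $A^n$, where $n > 3$, and $A^{-1}$.

9.40. Let $B = \begin{bmatrix} 8 & 12 & 0 \\ 0 & 8 & 12 \\ 0 & 0 & 8 \end{bmatrix}$. Find a real matrix $A$ such that $B = A^3$.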


9.41. For each matrix, find a polynomial having the following matrix as a root:

\[
\text{(a)}\; A = \begin{bmatrix} 2 & 5 \\ 1 & -3 \end{bmatrix}, \qquad
\text{(b)}\; B = \begin{bmatrix} 2 & -3 \\ 7 & -4 \end{bmatrix}, \qquad
\text{(c)}\; C = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 3 \\ 2 & 1 & 4 \end{bmatrix}
\]

9.42. Let $A$ be any square matrix and let $f(t)$ be any polynomial. Prove: (a) $(P^{-1}AP)^n = P^{-1}A^nP$.
(b) $f(P^{-1}AP) = P^{-1}f(A)P$. (c) $f(A^T) = [f(A)]^T$. (d) If $A$ is symmetric, then $f(A)$ is symmetric.

9.43. Let $M = \mathrm{diag}[A_1, \ldots, A_r]$ be a block diagonal matrix, and let $f(t)$ be any polynomial. Show that $f(M)$ is block
diagonal and $f(M) = \mathrm{diag}[f(A_1), \ldots, f(A_r)]$.

9.44. Let $M$ be a block triangular matrix with diagonal blocks $A_1, \ldots, A_r$, and let $f(t)$ be any polynomial. Show that
$f(M)$ is also a block triangular matrix, with diagonal blocks $f(A_1), \ldots, f(A_r)$.


Eigenvalues and Eigenvectors 

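9.45. For each of the following matrices, find all eigenvalues and corresponding linearly independent
eigenvectors:

\[
\text{(a)}\; A = \begin{bmatrix} 2 & -3 \\ 2 & -5 \end{bmatrix}, \qquad
\text{(b)}\; B = \begin{bmatrix} 2 & 4 \\ -1 & 6 \end{bmatrix}, \qquad
\text{(c)}\; C = \begin{bmatrix} 1 & -4 \\ 3 & -7 \end{bmatrix}
\]

When possible, find the nonsingular matrix $P$ that diagonalizes the matrix.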

9.46. Let $A = \begin{bmatrix} 2 & -1 \\ -2 & 3 \end{bmatrix}$.

(a) Find eigenvalues and corresponding eigenvectors.

(b) Find a nonsingular matrix $P$ such that $D = P^{-1}AP$ is diagonal.

(c) Find $A^8$ and $f(A)$ where $f(t) = t^4 - 5t^3 + 7t^2 - 2t + 5$.

(d) Find a matrix $B$ such that $B^2 = A$.
9.47. Repeat Problem 9.46 for $A = \begin{bmatrix} 5 & 6 \\ -2 & -2 \end{bmatrix}$.

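9.48. For each of the following matrices, find all eigenvalues and a maximum set $S$ of linearly independent
eigenvectors:

\[
\text{(a)}\; A = \begin{bmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{bmatrix}, \qquad
\text{(b)}\; B = \begin{bmatrix} 3 & -1 & 1 \\ 7 & -5 & 1 \\ 6 & -6 & 2 \end{bmatrix}, \qquad
\text{(c)}\; C = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 2 & -1 \\ -1 & 1 & 4 \end{bmatrix}
\]

Which matrices can be diagonalized, and why?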


9.49. Find: (a) a $2 \times 2$ matrix $A$ with eigenvalues $\lambda_1 = 1$ and $\lambda_2 = -2$ and corresponding eigenvectors $v_1 = (1, 2)$
and $v_2 = (3, 7)$.

(b) a $3 \times 3$ matrix $A$ with eigenvalues $\lambda_1 = 1$, $\lambda_2 = 2$, $\lambda_3 = 3$ and corresponding eigenvectors $v_1 = (1, 0, 1)$,
$v_2 = (1, 1, 2)$, $v_3 = (1, 2, 4)$. [Hint: See Problem 2.19.]


9.50. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ be a real matrix. Find necessary and sufficient conditions on $a, b, c, d$ so that $A$ is
diagonalizable, that is, so that $A$ has two (real) linearly independent eigenvectors.

9.51. Show that matrices $A$ and $A^T$ have the same eigenvalues. Give an example of a $2 \times 2$ matrix $A$ where $A$ and $A^T$
have different eigenvectors.


9.52. Suppose $v$ is an eigenvector of linear operators $F$ and $G$. Show that $v$ is also an eigenvector of the linear
operator $kF + k'G$, where $k$ and $k'$ are scalars.

9.53. Suppose $v$ is an eigenvector of a linear operator $T$ belonging to the eigenvalue $\lambda$. Prove:

(a) For $n > 0$, $v$ is an eigenvector of $T^n$ belonging to $\lambda^n$.

(b) $f(\lambda)$ is an eigenvalue of $f(T)$ for any polynomial $f(t)$.

9.54. Suppose $\lambda \neq 0$ is an eigenvalue of the composition $F \circ G$ of linear operators $F$ and $G$. Show that $\lambda$ is also an
eigenvalue of the composition $G \circ F$. [Hint: Show that $G(v)$ is an eigenvector of $G \circ F$.]

9.55. Let $E: V \to V$ be a projection mapping; that is, $E^2 = E$. Show that $E$ is diagonalizable and, in fact, can be
represented by the diagonal matrix $M = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $r$ is the rank of $E$.


Diagonalizing Real Symmetric Matrices and Quadratic Forms 

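9.56. For each of the following symmetric matrices $A$, find an orthogonal matrix $P$ and a diagonal matrix $D$ such
that $D = P^{-1}AP$:

\[
\text{(a)}\; A = \begin{bmatrix} 5 & 4 \\ 4 & -1 \end{bmatrix}, \qquad
\text{(b)}\; A = \begin{bmatrix} 4 & -1 \\ -1 & 4 \end{bmatrix}, \qquad
\text{(c)}\; A = \begin{bmatrix} 7 & 3 \\ 3 & -1 \end{bmatrix}
\]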

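9.57. For each of the following symmetric matrices $B$, find its eigenvalues, a maximal orthogonal set $S$ of
eigenvectors, and an orthogonal matrix $P$ such that $D = P^{-1}BP$ is diagonal:

\[
\text{(a)}\; B = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}, \qquad
\text{(b)}\; B = \begin{bmatrix} 2 & 2 & 4 \\ 2 & 5 & 8 \\ 4 & 8 & 17 \end{bmatrix}
\]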


9.58. Using variables $s$ and $t$, find an orthogonal substitution that diagonalizes each of the following quadratic forms:

(a) $q(x, y) = 4x^2 + 8xy - 11y^2$, (b) $q(x, y) = 2x^2 - 6xy + 10y^2$

9.59. For each of the following quadratic forms $q(x, y, z)$, find an orthogonal substitution expressing $x, y, z$ in terms
of variables $r, s, t$, and find $q(r, s, t)$:

(a) $q(x, y, z) = 5x^2 + 3y^2 + 12xz$, (b) $q(x, y, z) = 3x^2 - 4xy + 6y^2 + 2xz - 4yz + 3z^2$

9.60. Find a real $2 \times 2$ symmetric matrix $A$ with eigenvalues:

(a) $\lambda = 1$ and $\lambda = 4$ and eigenvector $u = (1, 1)$ belonging to $\lambda = 1$;

(b) $\lambda = 2$ and $\lambda = 3$ and eigenvector $u = (1, 2)$ belonging to $\lambda = 2$.

In each case, find a matrix $B$ for which $B^2 = A$.

Characteristic and Minimal Polynomials 

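9.61. Find the characteristic and minimal polynomials of each of the following matrices:

\[
\text{(a)}\; A = \begin{bmatrix} 3 & 1 & -1 \\ 2 & 4 & -2 \\ -1 & -1 & 3 \end{bmatrix}, \qquad
\text{(b)}\; B = \begin{bmatrix} 3 & 2 & -1 \\ 3 & 8 & -3 \\ 3 & 6 & -1 \end{bmatrix}
\]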

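9.62. Find the characteristic and minimal polynomials of each of the following matrices:

\[
\text{(a)}\; A = \begin{bmatrix} 2 & 5 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 4 & 2 & 0 \\ 0 & 0 & 3 & 5 & 0 \\ 0 & 0 & 0 & 0 & 7 \end{bmatrix}, \quad
\text{(b)}\; B = \begin{bmatrix} 4 & -1 & 0 & 0 & 0 \\ 1 & 2 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}, \quad
\text{(c)}\; C = \begin{bmatrix} 3 & 2 & 0 & 0 & 0 \\ 1 & 4 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 & 4 \end{bmatrix}
\]

9.63. Let $A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 1 \end{bmatrix}$. Show that $A$ and $B$ have different characteristic polynomials
(and so are not similar) but have the same minimal polynomial. Thus, nonsimilar matrices may have the same
minimal polynomial.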

9.64. Let $A$ be an $n$-square matrix for which $A^k = 0$ for some $k > n$. Show that $A^n = 0$.

9.65. Show that a matrix $A$ and its transpose $A^T$ have the same minimal polynomial.

9.66. Suppose $f(t)$ is an irreducible monic polynomial for which $f(A) = 0$ for a matrix $A$. Show that $f(t)$ is the
minimal polynomial of $A$.

9.67. Show that $A$ is a scalar matrix $kI$ if and only if the minimal polynomial of $A$ is $m(t) = t - k$.

9.68. Find a matrix $A$ whose minimal polynomial is (a) $t^3 - 5t^2 + 6t + 8$, (b) $t^4 - 5t^3 - 2t^2 + 7t + 4$.

9.69. Let $f(t)$ and $g(t)$ be monic polynomials (leading coefficient one) of minimal degree for which $A$ is a root.
Show $f(t) = g(t)$. [Thus, the minimal polynomial of $A$ is unique.]


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $M = [R_1; R_2; \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$.

9.38. $f(A) = [-26, -3;\ 5, -27]$, $g(A) = [-40, 39;\ -65, -27]$, $f(B) = [3, 6;\ 0, 9]$, $g(B) = [3, 12;\ 0, 15]$

9.39. $A^2 = [1, 4;\ 0, 1]$, $A^3 = [1, 6;\ 0, 1]$, $A^n = [1, 2n;\ 0, 1]$, $A^{-1} = [1, -2;\ 0, 1]$

9.40. Let $A = [2, a, b;\ 0, 2, c;\ 0, 0, 2]$. Set $B = A^3$ and then $a = 1$, $b = -\tfrac{1}{2}$, $c = 1$
9.41. Find $\Delta(t)$: (a) $t^2 + t - 11$, (b) $t^2 + 2t + 13$, (c) $t^3 - 7t^2 + 6t - 1$

9.45. (a) $\lambda = 1$, $u = (3, 1)$; $\lambda = -4$, $v = (1, 2)$, (b) $\lambda = 4$, $u = (2, 1)$,

(c) $\lambda = -1$, $u = (2, 1)$; $\lambda = -5$, $v = (2, 3)$. Only $A$ and $C$ can be diagonalized; use $P = [u, v]$.

9.46. (a) $\lambda = 1$, $u = (1, 1)$; $\lambda = 4$, $v = (1, -2)$,

(b) $P = [u, v]$,

(c) $f(A) = [19, -13;\ -26, 32]$, $A^8 = [21\,846, -21\,845;\ -43\,690, 43\,691]$,

(d) $B = [\tfrac{4}{3}, -\tfrac{1}{3};\ -\tfrac{2}{3}, \tfrac{5}{3}]$

9.47. (a) $\lambda = 1$, $u = (3, -2)$; $\lambda = 2$, $v = (2, -1)$, (b) $P = [u, v]$,

(c) $f(A) = [2, -6;\ 2, 9]$, $A^8 = [1021, 1530;\ -510, -764]$,

(d) $B = [-3 + 4\sqrt{2}, -6 + 6\sqrt{2};\ 2 - 2\sqrt{2}, 4 - 3\sqrt{2}]$

9.48. (a) $\lambda = -2$, $u = (1, 1, 0)$, $v = (1, 0, -1)$; $\lambda = 4$, $w = (1, 1, 2)$,

(b) $\lambda = 2$, $u = (1, 1, 0)$; $\lambda = -4$, $v = (0, 1, 1)$,

(c) $\lambda = 3$, $u = (1, 1, 0)$, $v = (1, 0, 1)$; $\lambda = 1$, $w = (2, -1, 1)$. Only $A$ and $C$ can be diagonalized; use
$P = [u, v, w]$.

9.49. (a) $[19, -9;\ 42, -20]$, (b) $[1, 1, 0;\ -2, 0, 2;\ -4, -2, 5]$

9.50. We need $[-\mathrm{tr}(A)]^2 - 4[\det(A)] > 0$ or $(a - d)^2 + 4bc > 0$.


9.51. $A = [1, 1;\ 0, 1]$

9.56. (a) $P = [2, -1;\ 1, 2]/\sqrt{5}$, $D = [7, 0;\ 0, -3]$,

(b) $P = [1, 1;\ 1, -1]/\sqrt{2}$, $D = [3, 0;\ 0, 5]$,

(c) $P = [3, -1;\ 1, 3]/\sqrt{10}$, $D = [8, 0;\ 0, -2]$

9.57. (a) $\lambda = -1$, $u = (1, -1, 0)$, $v = (1, 1, -2)$; $\lambda = 2$, $w = (1, 1, 1)$,

(b) $\lambda = 1$, $u = (2, 1, -1)$, $v = (2, -3, 1)$; $\lambda = 22$, $w = (1, 2, 4)$.

Normalize $u, v, w$, obtaining $\hat{u}, \hat{v}, \hat{w}$, and set $P = [\hat{u}, \hat{v}, \hat{w}]$. (Remark: $u$ and $v$ are not unique.)

9.58. (a) $x = (4s + t)/\sqrt{17}$, $y = (s - 4t)/\sqrt{17}$, $q(s, t) = 5s^2 - 12t^2$,

(b) $x = (3s - t)/\sqrt{10}$, $y = (s + 3t)/\sqrt{10}$, $q(s, t) = s^2 + 11t^2$

9.59. (a) $x = (3s + 2t)/\sqrt{13}$, $y = r$, $z = (2s - 3t)/\sqrt{13}$, $q(r, s, t) = 3r^2 + 9s^2 - 4t^2$,

(b) $x = 5Ks + Lt$, $y = Jr + 2Ks - 2Lt$, $z = 2Jr - Ks + Lt$, where $J = 1/\sqrt{5}$, $K = 1/\sqrt{30}$,
$L = 1/\sqrt{6}$; $q(r, s, t) = 2r^2 + 2s^2 + 8t^2$

9.60. (a) $A = \tfrac{1}{2}[5, -3;\ -3, 5]$, $B = \tfrac{1}{2}[3, -1;\ -1, 3]$,

(b) $A = \tfrac{1}{5}[14, -2;\ -2, 11]$, $B = \tfrac{1}{5}[\sqrt{2} + 4\sqrt{3},\ 2\sqrt{2} - 2\sqrt{3};\ 2\sqrt{2} - 2\sqrt{3},\ 4\sqrt{2} + \sqrt{3}]$

9.61. (a) $\Delta(t) = m(t) = (t - 2)^2(t - 6)$, (b) $\Delta(t) = (t - 2)^2(t - 6)$, $m(t) = (t - 2)(t - 6)$

9.62. (a) $\Delta(t) = (t - 2)^3(t - 7)^2$, $m(t) = (t - 2)^2(t - 7)$,

(b) $\Delta(t) = (t - 3)^5$, $m(t) = (t - 3)^3$,

(c) $\Delta(t) = (t - 2)^2(t - 4)^2(t - 5)$, $m(t) = (t - 2)(t - 4)(t - 5)$

9.68. Let $A$ be the companion matrix [Example 9.12(b)] with last column: (a) $[-8, -6, 5]^T$, (b) $[-4, -7, 2, 5]^T$

9.69. Hint: $A$ is a root of $h(t) = f(t) - g(t)$, where $h(t) = 0$ or the degree of $h(t)$ is less than the degree of $f(t)$.



CHAPTER 10 



Canonical Forms 


10.1 Introduction 


Let $T$ be a linear operator on a vector space of finite dimension. As seen in Chapter 6, $T$ may not have a
diagonal matrix representation. However, it is still possible to "simplify" the matrix representation of $T$ in
a number of ways. This is the main topic of this chapter. In particular, we obtain the primary
decomposition theorem, and the triangular, Jordan, and rational canonical forms.

We comment that the triangular and Jordan canonical forms exist for $T$ if and only if the characteristic
polynomial $\Delta(t)$ of $T$ has all its roots in the base field $K$. This is always true if $K$ is the complex field $\mathbf{C}$ but
may not be true if $K$ is the real field $\mathbf{R}$.

We also introduce the idea of a quotient space. This is a very powerful tool, and it will be used in the
proof of the existence of the triangular and rational canonical forms.


10.2 Triangular Form 


Let $T$ be a linear operator on an $n$-dimensional vector space $V$. Suppose $T$ can be represented by the
triangular matrix

\[
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ & a_{22} & \cdots & a_{2n} \\ & & \ddots & \vdots \\ & & & a_{nn} \end{bmatrix}
\]

Then the characteristic polynomial $\Delta(t)$ of $T$ is a product of linear factors; that is,

\[
\Delta(t) = \det(tI - A) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn})
\]

The converse is also true and is an important theorem (proved in Problem 10.28).

THEOREM 10.1: Let $T: V \to V$ be a linear operator whose characteristic polynomial factors into linear
polynomials. Then there exists a basis of $V$ in which $T$ is represented by a triangular
matrix.

THEOREM 10.1: (Alternative Form) Let $A$ be a square matrix whose characteristic polynomial
factors into linear polynomials. Then $A$ is similar to a triangular matrix; that is, there
exists an invertible matrix $P$ such that $P^{-1}AP$ is triangular.

We say that an operator $T$ can be brought into triangular form if it can be represented by a triangular
matrix. Note that in this case, the eigenvalues of $T$ are precisely those entries appearing on the main
diagonal. We give an application of this remark.
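EXAMPLE 10.1 Let $A$ be a square matrix over the complex field $\mathbf{C}$. Suppose $\lambda$ is an eigenvalue of $A^2$. Show that $\sqrt{\lambda}$
or $-\sqrt{\lambda}$ is an eigenvalue of $A$.

By Theorem 10.1, $A$ and $A^2$ are similar, respectively, to triangular matrices of the form

\[
B = \begin{bmatrix} \mu_1 & * & \cdots & * \\ & \mu_2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n \end{bmatrix}
\qquad \text{and} \qquad
B^2 = \begin{bmatrix} \mu_1^2 & * & \cdots & * \\ & \mu_2^2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n^2 \end{bmatrix}
\]

Because similar matrices have the same eigenvalues, $\lambda = \mu_i^2$ for some $i$. Hence, $\mu_i = \sqrt{\lambda}$ or $\mu_i = -\sqrt{\lambda}$ is an
eigenvalue of $A$.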

10.3 Invariance 


Let $T: V \to V$ be linear. A subspace $W$ of $V$ is said to be invariant under $T$ or $T$-invariant if $T$ maps $W$ into
itself, that is, if $v \in W$ implies $T(v) \in W$. In this case, $T$ restricted to $W$ defines a linear operator on $W$;
that is, $T$ induces a linear operator $\hat{T}: W \to W$ defined by $\hat{T}(w) = T(w)$ for every $w \in W$.

EXAMPLE 10.2 

(a) Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the following linear operator, which rotates each vector $v$ about the $z$-axis by an angle $\theta$
(shown in Fig. 10-1):

\[
T(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z)
\]

Observe that each vector $w = (a, b, 0)$ in the $xy$-plane $W$ remains in $W$ under the mapping $T$; hence, $W$ is
$T$-invariant. Observe also that the $z$-axis $U$ is invariant under $T$. Furthermore, the restriction of $T$ to $W$ rotates each
vector about the origin $O$, and the restriction of $T$ to $U$ is the identity mapping of $U$.

(b) Nonzero eigenvectors of a linear operator $T: V \to V$ may be characterized as generators of $T$-invariant
one-dimensional subspaces. Suppose $T(v) = \lambda v$, $v \neq 0$. Then $W = \{kv : k \in K\}$, the one-dimensional subspace
generated by $v$, is invariant under $T$ because

\[
T(kv) = kT(v) = k(\lambda v) = k\lambda v \in W
\]

Conversely, suppose $\dim U = 1$ and $u \neq 0$ spans $U$, and $U$ is invariant under $T$. Then $T(u) \in U$ and so $T(u)$ is a
multiple of $u$; that is, $T(u) = \mu u$. Hence, $u$ is an eigenvector of $T$.

The next theorem (proved in Problem 10.3) gives us an important class of invariant subspaces. 

THEOREM 10.2: Let $T: V \to V$ be any linear operator, and let $f(t)$ be any polynomial. Then the kernel
of $f(T)$ is invariant under $T$.

The notion of invariance is related to matrix representations (Problem 10.5) as follows.

THEOREM 10.3: Suppose $W$ is an invariant subspace of $T: V \to V$. Then $T$ has a block matrix representation
$\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is a matrix representation of the restriction $\hat{T}$ of $T$ to $W$.
10.4 Invariant Direct-Sum Decompositions


A vector space $V$ is termed the direct sum of subspaces $W_1, \ldots, W_r$, written

\[
V = W_1 \oplus W_2 \oplus \cdots \oplus W_r
\]

if every vector $v \in V$ can be written uniquely in the form

\[
v = w_1 + w_2 + \cdots + w_r, \qquad \text{with } w_i \in W_i
\]

The following theorem (proved in Problem 10.7) holds.


THEOREM 10.4: Suppose $W_1, W_2, \ldots, W_r$ are subspaces of $V$, and suppose

\[
B_1 = \{w_{11}, w_{12}, \ldots, w_{1n_1}\}, \quad \ldots, \quad B_r = \{w_{r1}, w_{r2}, \ldots, w_{rn_r}\}
\]

are bases of $W_1, W_2, \ldots, W_r$, respectively. Then $V$ is the direct sum of the $W_i$ if and
only if the union $B = B_1 \cup \cdots \cup B_r$ is a basis of $V$.


Now suppose $T: V \to V$ is linear and $V$ is the direct sum of (nonzero) $T$-invariant subspaces
$W_1, W_2, \ldots, W_r$; that is,

\[
V = W_1 \oplus \cdots \oplus W_r \qquad \text{and} \qquad T(W_i) \subseteq W_i, \quad i = 1, \ldots, r
\]

Let $T_i$ denote the restriction of $T$ to $W_i$. Then $T$ is said to be decomposable into the operators $T_i$ or $T$ is said
to be the direct sum of the $T_i$, written $T = T_1 \oplus \cdots \oplus T_r$. Also, the subspaces $W_1, \ldots, W_r$ are said to
reduce $T$ or to form a $T$-invariant direct-sum decomposition of $V$.

Consider the special case where two subspaces $U$ and $W$ reduce an operator $T: V \to V$; say $\dim U = 2$
and $\dim W = 3$, and suppose $\{u_1, u_2\}$ and $\{w_1, w_2, w_3\}$ are bases of $U$ and $W$, respectively. If $T_1$ and $T_2$
denote the restrictions of $T$ to $U$ and $W$, respectively, then

\[
\begin{aligned}
T_1(u_1) &= a_{11}u_1 + a_{12}u_2 \\
T_1(u_2) &= a_{21}u_1 + a_{22}u_2
\end{aligned}
\qquad\qquad
\begin{aligned}
T_2(w_1) &= b_{11}w_1 + b_{12}w_2 + b_{13}w_3 \\
T_2(w_2) &= b_{21}w_1 + b_{22}w_2 + b_{23}w_3 \\
T_2(w_3) &= b_{31}w_1 + b_{32}w_2 + b_{33}w_3
\end{aligned}
\]

Accordingly, the following matrices $A$, $B$, $M$ are the matrix representations of $T_1$, $T_2$, $T$, respectively:

\[
A = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{bmatrix}, \qquad
B = \begin{bmatrix} b_{11} & b_{21} & b_{31} \\ b_{12} & b_{22} & b_{32} \\ b_{13} & b_{23} & b_{33} \end{bmatrix}, \qquad
M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}
\]

The block diagonal matrix $M$ results from the fact that $\{u_1, u_2, w_1, w_2, w_3\}$ is a basis of $V$ (Theorem 10.4),
and that $T(u_i) = T_1(u_i)$ and $T(w_j) = T_2(w_j)$.

A generalization of the above argument gives us the following theorem.


THEOREM 10.5: Suppose $T: V \to V$ is linear and suppose $V$ is the direct sum of $T$-invariant subspaces,
say, $W_1, \ldots, W_r$. If $A_i$ is a matrix representation of the restriction of $T$ to $W_i$, then $T$
can be represented by the block diagonal matrix:

\[
M = \mathrm{diag}(A_1, A_2, \ldots, A_r)
\]

10.5 Primary Decomposition 


The following theorem shows that any operator $T: V \to V$ is decomposable into operators whose minimum
polynomials are powers of irreducible polynomials. This is the first step in obtaining a canonical form
for $T$.

THEOREM 10.6: (Primary Decomposition Theorem) Let $T: V \to V$ be a linear operator with minimal
polynomial

\[
m(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}
\]

where the $f_i(t)$ are distinct monic irreducible polynomials. Then $V$ is the direct sum
of $T$-invariant subspaces $W_1, \ldots, W_r$, where $W_i$ is the kernel of $f_i(T)^{n_i}$. Moreover,
$f_i(t)^{n_i}$ is the minimal polynomial of the restriction of $T$ to $W_i$.

The above polynomials $f_i(t)^{n_i}$ are relatively prime. Therefore, the above fundamental theorem follows
(Problem 10.11) from the next two theorems (proved in Problems 10.9 and 10.10, respectively).

THEOREM 10.7: Suppose $T: V \to V$ is linear, and suppose $f(t) = g(t)h(t)$ are polynomials such that
$f(T) = 0$ and $g(t)$ and $h(t)$ are relatively prime. Then $V$ is the direct sum of the
$T$-invariant subspaces $U$ and $W$, where $U = \mathrm{Ker}\, g(T)$ and $W = \mathrm{Ker}\, h(T)$.

THEOREM 10.8: In Theorem 10.7, if $f(t)$ is the minimal polynomial of $T$ [and $g(t)$ and $h(t)$ are
monic], then $g(t)$ and $h(t)$ are the minimal polynomials of the restrictions of $T$ to $U$
and $W$, respectively.

We will also use the primary decomposition theorem to prove the following useful characterization of
diagonalizable operators (see Problem 10.12 for the proof).

THEOREM 10.9: A linear operator $T: V \to V$ is diagonalizable if and only if its minimal polynomial
$m(t)$ is a product of distinct linear polynomials.

THEOREM 10.9: (Alternative Form) A matrix $A$ is similar to a diagonal matrix if and only if its
minimal polynomial is a product of distinct linear polynomials.

EXAMPLE 10.3 Suppose $A \neq I$ is a square matrix for which $A^3 = I$. Determine whether or not $A$ is similar to a
diagonal matrix if $A$ is a matrix over: (i) the real field $\mathbf{R}$, (ii) the complex field $\mathbf{C}$.

Because $A^3 = I$, $A$ is a zero of the polynomial $f(t) = t^3 - 1 = (t - 1)(t^2 + t + 1)$. The minimal polynomial $m(t)$ of
$A$ cannot be $t - 1$, because $A \neq I$. Hence,

\[
m(t) = t^2 + t + 1 \qquad \text{or} \qquad m(t) = t^3 - 1
\]

Because neither polynomial is a product of linear polynomials over $\mathbf{R}$, $A$ is not diagonalizable over $\mathbf{R}$. On the other
hand, each of the polynomials is a product of distinct linear polynomials over $\mathbf{C}$. Hence, $A$ is diagonalizable
over $\mathbf{C}$.
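Example 10.3 can be made concrete with one particular matrix. The sketch below is an illustration under an assumption: the matrix $A = \begin{bmatrix} 0 & -1 \\ 1 & -1 \end{bmatrix}$ is chosen here and does not appear in the text. It checks that $A \neq I$, $A^3 = I$, and $A^2 + A + I = 0$, so $m(t) = t^2 + t + 1$ for this $A$; the negative discriminant confirms that $m(t)$ has no real roots.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


I = [[1, 0], [0, 1]]
A = [[0, -1], [1, -1]]   # hypothetical example with A != I and A^3 = I

A2 = matmul(A, A)
A3 = matmul(A2, A)

# m(A) for m(t) = t^2 + t + 1
mA = [[A2[i][j] + A[i][j] + I[i][j] for j in range(2)] for i in range(2)]

# Discriminant b^2 - 4ac of t^2 + t + 1; negative means no real roots,
# so m(t) is not a product of linear polynomials over R.
disc = 1 * 1 - 4 * 1 * 1

print(A3 == I, mA, disc)
```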


10.6 Nilpotent Operators 


A linear operator $T: V \to V$ is termed nilpotent if $T^n = 0$ for some positive integer $n$; we call $k$ the index of
nilpotency of $T$ if $T^k = 0$ but $T^{k-1} \neq 0$. Analogously, a square matrix $A$ is termed nilpotent if $A^n = 0$ for
some positive integer $n$, and of index $k$ if $A^k = 0$ but $A^{k-1} \neq 0$. Clearly the minimum polynomial of a
nilpotent operator (matrix) of index $k$ is $m(t) = t^k$; hence, 0 is its only eigenvalue.

EXAMPLE 10.4 The following two $r$-square matrices will be used throughout the chapter:

\[
N = N(r) = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
0 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix}
\qquad \text{and} \qquad
J(\lambda) = \begin{bmatrix}
\lambda & 1 & 0 & \cdots & 0 & 0 \\
0 & \lambda & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \lambda & 1 \\
0 & 0 & 0 & \cdots & 0 & \lambda
\end{bmatrix}
\]

The first matrix $N$, called a Jordan nilpotent block, consists of 1's above the diagonal (called the
superdiagonal), and 0's elsewhere. It is a nilpotent matrix of index $r$. (The matrix $N$ of order 1 is just the $1 \times 1$ zero
matrix $[0]$.)

The second matrix $J(\lambda)$, called a Jordan block belonging to the eigenvalue $\lambda$, consists of $\lambda$'s on the diagonal, 1's
on the superdiagonal, and 0's elsewhere. Observe that

\[
J(\lambda) = \lambda I + N
\]

In fact, we will prove that any linear operator $T$ can be decomposed into operators, each of which is the sum of a
scalar operator and a nilpotent operator.

The following (proved in Problem 10.16) is a fundamental result on nilpotent operators.


THEOREM 10.10: Let $T: V \to V$ be a nilpotent operator of index $k$. Then $T$ has a block diagonal matrix
representation in which each diagonal entry is a Jordan nilpotent block $N$. There is
at least one $N$ of order $k$, and all other $N$ are of orders $\leq k$. The number of $N$ of each
possible order is uniquely determined by $T$. The total number of $N$ of all orders is
equal to the nullity of $T$.

The proof of Theorem 10.10 shows that the number of $N$ of order $i$ is equal to $2m_i - m_{i+1} - m_{i-1}$,
where $m_i$ is the nullity of $T^i$.
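The counting formula $2m_i - m_{i+1} - m_{i-1}$ is easy to try on a small case. The following sketch is an illustration only: the helper names `rank` and `nilpotent_block_counts` and the $5 \times 5$ test matrix $\mathrm{diag}(N(3), N(2))$ are assumptions made here. Rank is computed by Gaussian elimination over the rationals, and the nullity of $T^i$ is taken as $n - \mathrm{rank}(T^i)$, with $m_0 = 0$ because $T^0 = I$.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def rank(M):
    """Rank of M by Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def nilpotent_block_counts(T):
    """For a nilpotent matrix T, return {order: number of Jordan nilpotent
    blocks of that order}, using 2*m_i - m_{i+1} - m_{i-1}."""
    n = len(T)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # T^0 = I
    m = []                                   # m[i] = nullity of T^i
    for _ in range(n + 2):
        m.append(n - rank(power))
        power = matmul(power, T)
    counts = {}
    for i in range(1, n + 1):
        c = 2 * m[i] - m[i + 1] - m[i - 1]
        if c:
            counts[i] = c
    return counts


# Hypothetical test case: T = diag(N(3), N(2))
T = [[0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]
counts = nilpotent_block_counts(T)
print(counts)   # one block of order 2 and one of order 3
```

The total number of blocks found, 2, agrees with the nullity of this $T$, as Theorem 10.10 states.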


10.7 Jordan Canonical Form 


An operator T can be put into Jordan canonical form if its characteristic and minimal polynomials factor 
into linear polynomials. This is always true if K is the complex field C. In any case, we can always extend 
the base field K to a field in which the characteristic and minimal polynomials do factor into linear factors; 
thus, in a broad sense, every operator has a Jordan canonical form. Analogously, every matrix is similar to 
a matrix in Jordan canonical form. 

The following theorem (proved in Problem 10.18) describes the Jordan canonical form J of a linear 
operator T. 


THEOREM 10.11: Let $T: V \to V$ be a linear operator whose characteristic and minimal polynomials
are, respectively,

\[
\Delta(t) = (t - \lambda_1)^{n_1} \cdots (t - \lambda_r)^{n_r} \qquad \text{and} \qquad m(t) = (t - \lambda_1)^{m_1} \cdots (t - \lambda_r)^{m_r}
\]

where the $\lambda_i$ are distinct scalars. Then $T$ has a block diagonal matrix representation
$J$ in which each diagonal entry is a Jordan block $J_{ij} = J(\lambda_i)$. For each $\lambda_i$, the
corresponding $J_{ij}$ have the following properties:

(i) There is at least one $J_{ij}$ of order $m_i$; all other $J_{ij}$ are of order $\leq m_i$.

(ii) The sum of the orders of the $J_{ij}$ is $n_i$.

(iii) The number of $J_{ij}$ equals the geometric multiplicity of $\lambda_i$.

(iv) The number of $J_{ij}$ of each possible order is uniquely determined by $T$.


EXAMPLE 10.5 Suppose the characteristic and minimal polynomials of an operator $T$ are,
respectively,

\[
\Delta(t) = (t - 2)^4 (t - 5)^3 \qquad \text{and} \qquad m(t) = (t - 2)^2 (t - 5)^3
\]

Then the Jordan canonical form of $T$ is one of the following block diagonal matrices:

\[
\mathrm{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix} \right)
\qquad \text{or} \qquad
\mathrm{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2],\ [2],\ \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix} \right)
\]

The first matrix occurs if $T$ has two independent eigenvectors belonging to the eigenvalue 2; and the second matrix
occurs if $T$ has three independent eigenvectors belonging to the eigenvalue 2.
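The case analysis in Example 10.5 follows a pattern that can be enumerated mechanically: for each eigenvalue, the block orders form a non-increasing list that sums to the algebraic multiplicity and whose largest entry is the exponent in the minimal polynomial. The sketch below is one way to list the candidates; the function name `block_orders` and the dictionary layout are hypothetical choices made for this illustration.

```python
from itertools import product


def block_orders(n, m):
    """All non-increasing lists of positive integers that sum to n and
    whose largest entry is exactly m (candidate Jordan block orders for
    an eigenvalue of algebraic multiplicity n and minimal exponent m)."""
    def partitions(total, largest):
        if total == 0:
            yield []
            return
        for part in range(min(total, largest), 0, -1):
            for rest in partitions(total - part, part):
                yield [part] + rest

    if m < 1 or m > n:
        return []
    return [[m] + rest for rest in partitions(n - m, m)]


# Example 10.5: eigenvalue 2 has (n, m) = (4, 2); eigenvalue 5 has (3, 3).
spec = {2: (4, 2), 5: (3, 3)}
forms = [dict(zip(spec, choice))
         for choice in product(*(block_orders(n, m) for n, m in spec.values()))]
for form in forms:
    print(form)
```

The number of blocks listed for an eigenvalue is its geometric multiplicity, so the two candidates for the eigenvalue 2 correspond to two and three independent eigenvectors, as in the example.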


10.8 Cyclic Subspaces 


Let $T$ be a linear operator on a vector space $V$ of finite dimension over $K$. Suppose $v \in V$ and $v \neq 0$. The
set of all vectors of the form $f(T)(v)$, where $f(t)$ ranges over all polynomials over $K$, is a $T$-invariant
subspace of $V$ called the $T$-cyclic subspace of $V$ generated by $v$; we denote it by $Z(v, T)$ and denote the
restriction of $T$ to $Z(v, T)$ by $T_v$. By Problem 10.56, we could equivalently define $Z(v, T)$ as the
intersection of all $T$-invariant subspaces of $V$ containing $v$.

Now consider the sequence

\[
v,\ T(v),\ T^2(v),\ T^3(v),\ \ldots
\]

of powers of $T$ acting on $v$. Let $k$ be the least integer such that $T^k(v)$ is a linear combination of those
vectors that precede it in the sequence, say,

\[
T^k(v) = -a_{k-1}T^{k-1}(v) - \cdots - a_1 T(v) - a_0 v
\]

Then

\[
m_v(t) = t^k + a_{k-1}t^{k-1} + \cdots + a_1 t + a_0
\]

is the unique monic polynomial of lowest degree for which $m_v(T)(v) = 0$. We call $m_v(t)$ the
$T$-annihilator of $v$ and $Z(v, T)$.


The following theorem (proved in Problem 10.29) holds.


THEOREM 10.12: Let $Z(v, T)$, $T_v$, $m_v(t)$ be defined as above. Then

(i) The set $\{v, T(v), \ldots, T^{k-1}(v)\}$ is a basis of $Z(v, T)$; hence, $\dim Z(v, T) = k$.

(ii) The minimal polynomial of $T_v$ is $m_v(t)$.

(iii) The matrix representation of $T_v$ in the above basis is just the companion
matrix $C(m_v)$ of $m_v(t)$; that is,

\[
C(m_v) = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & -a_0 \\
1 & 0 & 0 & \cdots & 0 & -a_1 \\
0 & 1 & 0 & \cdots & 0 & -a_2 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & -a_{k-2} \\
0 & 0 & 0 & \cdots & 1 & -a_{k-1}
\end{bmatrix}
\]

10.9 Rational Canonical Form 


In this section, we present the rational canonical form for a linear operator $T: V \to V$. We emphasize that
this form exists even when the minimal polynomial cannot be factored into linear polynomials. (Recall that
this is not the case for the Jordan canonical form.)

LEMMA 10.13: Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$, where $f(t)$ is a
monic irreducible polynomial. Then $V$ is the direct sum

\[
V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T)
\]

of $T$-cyclic subspaces $Z(v_i, T)$ with corresponding $T$-annihilators

\[
f(t)^{n_1},\ f(t)^{n_2},\ \ldots,\ f(t)^{n_r}, \qquad n = n_1 \geq n_2 \geq \cdots \geq n_r
\]

Any other decomposition of $V$ into $T$-cyclic subspaces has the same number of
components and the same set of $T$-annihilators.


We emphasize that the above lemma (proved in Problem 10.31) does not say that the vectors $v_i$ or
other $T$-cyclic subspaces $Z(v_i, T)$ are uniquely determined by $T$, but it does say that the set of
$T$-annihilators is uniquely determined by $T$. Thus, $T$ has a unique block diagonal matrix representation:

\[
M = \mathrm{diag}(C_1, C_2, \ldots, C_r)
\]

where the $C_i$ are companion matrices. In fact, the $C_i$ are the companion matrices of the polynomials $f(t)^{n_i}$.
Using the Primary Decomposition Theorem and Lemma 10.13, we obtain the following result.

THEOREM 10.14: Let $T: V \to V$ be a linear operator with minimal polynomial

\[
m(t) = f_1(t)^{m_1} f_2(t)^{m_2} \cdots f_s(t)^{m_s}
\]

where the $f_i(t)$ are distinct monic irreducible polynomials. Then $T$ has a unique
block diagonal matrix representation:

\[
M = \mathrm{diag}(C_{11}, C_{12}, \ldots, C_{1r_1}, \ldots, C_{s1}, C_{s2}, \ldots, C_{sr_s})
\]

where the $C_{ij}$ are companion matrices. In particular, the $C_{ij}$ are the companion
matrices of the polynomials $f_i(t)^{n_{ij}}$, where

\[
m_1 = n_{11} \geq n_{12} \geq \cdots \geq n_{1r_1}, \quad \ldots, \quad m_s = n_{s1} \geq n_{s2} \geq \cdots \geq n_{sr_s}
\]

The above matrix representation of $T$ is called its rational canonical form. The polynomials $f_i(t)^{n_{ij}}$ are
called the elementary divisors of $T$.

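EXAMPLE 10.6 Let $V$ be a vector space of dimension 8 over the rational field $\mathbf{Q}$, and let $T$ be a linear operator on $V$
whose minimal polynomial is

\[
m(t) = f_1(t) f_2(t)^2 = (t^4 - 4t^3 + 6t^2 - 4t - 7)(t - 3)^2
\]

Thus, because $\dim V = 8$, the characteristic polynomial is $\Delta(t) = f_1(t) f_2(t)^4$. Also, the rational canonical form $M$ of $T$
must have one block the companion matrix of $f_1(t)$ and one block the companion matrix of $f_2(t)^2$. There are two
possibilities:

(a) $\mathrm{diag}[C(t^4 - 4t^3 + 6t^2 - 4t - 7),\ C((t - 3)^2),\ C((t - 3)^2)]$

(b) $\mathrm{diag}[C(t^4 - 4t^3 + 6t^2 - 4t - 7),\ C((t - 3)^2),\ C(t - 3),\ C(t - 3)]$

That is,

\[
\text{(a)}\; \mathrm{diag}\left( \begin{bmatrix} 0 & 0 & 0 & 7 \\ 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -6 \\ 0 & 0 & 1 & 4 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix} \right),
\qquad
\text{(b)}\; \mathrm{diag}\left( \begin{bmatrix} 0 & 0 & 0 & 7 \\ 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -6 \\ 0 & 0 & 1 & 4 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix},\ [3],\ [3] \right)
\]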


10.10 Quotient Spaces 


Let $V$ be a vector space over a field $K$ and let $W$ be a subspace of $V$. If $v$ is any vector in $V$, we write $v + W$
for the set of sums $v + w$ with $w \in W$; that is,

\[
v + W = \{v + w : w \in W\}
\]

These sets are called the cosets of $W$ in $V$. We show (Problem 10.22) that these cosets partition $V$ into
mutually disjoint subsets.


EXAMPLE 10.7 Let $W$ be the subspace of $\mathbf{R}^2$ defined by

\[
W = \{(a, b) : a = b\},
\]

that is, $W$ is the line given by the equation $x - y = 0$. We can view
$v + W$ as a translation of the line obtained by adding the vector $v$
to each point in $W$. As shown in Fig. 10-2, the coset $v + W$ is also
a line, and it is parallel to $W$. Thus, the cosets of $W$ in $\mathbf{R}^2$ are
precisely all the lines parallel to $W$.
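The cosets in Example 10.7 can be handled concretely by noting that $u + W = v + W$ exactly when $u - v \in W$. The short sketch below is illustrative only: the function names are hypothetical, and the labelling of each coset by the constant $c$ in its equation $x - y = c$ is a convenience introduced here rather than something stated in the text.

```python
def in_W(p):
    """W = {(a, b) : a = b}, the line x - y = 0 in R^2."""
    return p[0] == p[1]


def same_coset(u, v):
    """u + W equals v + W exactly when u - v lies in W."""
    return in_W((u[0] - v[0], u[1] - v[1]))


def coset_label(v):
    """The coset v + W is the line x - y = c; return c as a label."""
    return v[0] - v[1]


u, v = (3, 1), (5, 3)
print(same_coset(u, v))                    # True: both lie on x - y = 2
print(same_coset(u, (3, 2)))               # False: different parallel lines
print(coset_label(u), coset_label(v))      # 2 2
```

Because the label of $u + v$ is the sum of the labels of $u$ and $v$, adding cosets through representatives does not depend on which representatives are chosen, which is the well-definedness issue that arises when the cosets are made into a vector space.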

In the following theorem, we use the cosets of a subspace 
W of a vector space V to define a new vector space; it is 
called the quotient space of V by W and is denoted by V/W. 



THEOREM 10.15: Let W be a subspace of a vector space V over a field K. Then the cosets of W in V form a vector space over K with the following operations of addition and scalar multiplication:

(i) $(u + W) + (v + W) = (u + v) + W$, (ii) $k(u + W) = ku + W$, where $k \in K$

We note that, in the proof of Theorem 10.15 (Problem 10.24), it is first necessary to show that the operations are well defined; that is, whenever $u + W = u' + W$ and $v + W = v' + W$, then

(i) $(u + v) + W = (u' + v') + W$ and (ii) $ku + W = ku' + W$ for any $k \in K$

In the case of an invariant subspace, we have the following useful result (proved in Problem 10.27).

THEOREM 10.16: Suppose W is a subspace invariant under a linear operator $T: V \to V$. Then T induces a linear operator $\bar{T}$ on V/W defined by $\bar{T}(v + W) = T(v) + W$. Moreover, if T is a zero of any polynomial, then so is $\bar{T}$. Thus, the minimal polynomial of $\bar{T}$ divides the minimal polynomial of T.


SOLVED PROBLEMS 


Invariant Subspaces 

10.1. Suppose $T: V \to V$ is linear. Show that each of the following is invariant under T:

(a) {0}, (b) V, (c) kernel of T, (d) image of T.

(a) We have $T(0) = 0 \in \{0\}$; hence, {0} is invariant under T.

(b) For every $v \in V$, $T(v) \in V$; hence, V is invariant under T.

(c) Let $u \in \operatorname{Ker} T$. Then $T(u) = 0 \in \operatorname{Ker} T$ because the kernel of T is a subspace of V. Thus, Ker T is invariant under T.

(d) Because $T(v) \in \operatorname{Im} T$ for every $v \in V$, it is certainly true when $v \in \operatorname{Im} T$. Hence, the image of T is invariant under T.

10.2. Suppose $\{W_i\}$ is a collection of T-invariant subspaces of a vector space V. Show that the intersection $W = \bigcap_i W_i$ is also T-invariant.

Suppose $v \in W$; then $v \in W_i$ for every i. Because $W_i$ is T-invariant, $T(v) \in W_i$ for every i. Thus, $T(v) \in W$ and so W is T-invariant.
10.3. Prove Theorem 10.2: Let $T: V \to V$ be linear. For any polynomial f(t), the kernel of f(T) is invariant under T.

Suppose $v \in \operatorname{Ker} f(T)$, that is, $f(T)(v) = 0$. We need to show that T(v) also belongs to the kernel of f(T), that is, $f(T)(T(v)) = (f(T) \circ T)(v) = 0$. Because $f(t)\,t = t\,f(t)$, we have $f(T) \circ T = T \circ f(T)$. Thus, as required,

$(f(T) \circ T)(v) = (T \circ f(T))(v) = T(f(T)(v)) = T(0) = 0$


10.4. Find all invariant subspaces of $A = \begin{bmatrix} 2 & -5 \\ 1 & -2 \end{bmatrix}$ viewed as an operator on $\mathbf{R}^2$.

By Problem 10.1, $\mathbf{R}^2$ and {0} are invariant under A. Now if A has any other invariant subspace, it must be one-dimensional. However, the characteristic polynomial of A is

$\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 + 1$

Hence, A has no eigenvalues (in $\mathbf{R}$) and so A has no eigenvectors. But the one-dimensional invariant subspaces correspond to the eigenvectors; thus, $\mathbf{R}^2$ and {0} are the only subspaces invariant under A.
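
The computation in Problem 10.4 can be checked numerically. The following is a minimal Python sketch, not part of the original text; the helper name `matmul` and the variable names are our own. It computes the trace and determinant of A, the discriminant of $t^2 - \operatorname{tr}(A)t + |A|$, and confirms that $A^2 = -I$, which is another way of seeing that A satisfies $t^2 + 1$.

```python
# Matrix A of Problem 10.4, written as a list of rows.
A = [[2, -5],
     [1, -2]]

trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# For a 2x2 matrix the characteristic polynomial is t^2 - trace*t + det.
# A negative discriminant means there are no real eigenvalues.
discriminant = trace * trace - 4 * det
has_real_eigenvalue = discriminant >= 0


def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A_squared = matmul(A, A)

print("trace =", trace, "det =", det, "discriminant =", discriminant)
print("has a real eigenvalue:", has_real_eigenvalue)
print("A^2 =", A_squared)
```

Because the discriminant is negative, no real eigenvector exists, so no line through the origin is mapped into itself by A.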


10.5. Prove Theorem 10.3: Suppose W is T-invariant. Then T has a triangular block representation $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where A is the matrix representation of the restriction $\hat{T}$ of T to W.

We choose a basis $\{w_1, \ldots, w_r\}$ of W and extend it to a basis $\{w_1, \ldots, w_r, v_1, \ldots, v_s\}$ of V. We have

\[
\begin{aligned}
\hat{T}(w_1) = T(w_1) &= a_{11}w_1 + \cdots + a_{1r}w_r \\
\hat{T}(w_2) = T(w_2) &= a_{21}w_1 + \cdots + a_{2r}w_r \\
&\ \ \vdots \\
\hat{T}(w_r) = T(w_r) &= a_{r1}w_1 + \cdots + a_{rr}w_r \\
T(v_1) &= b_{11}w_1 + \cdots + b_{1r}w_r + c_{11}v_1 + \cdots + c_{1s}v_s \\
T(v_2) &= b_{21}w_1 + \cdots + b_{2r}w_r + c_{21}v_1 + \cdots + c_{2s}v_s \\
&\ \ \vdots \\
T(v_s) &= b_{s1}w_1 + \cdots + b_{sr}w_r + c_{s1}v_1 + \cdots + c_{ss}v_s
\end{aligned}
\]

But the matrix of T in this basis is the transpose of the matrix of coefficients in the above system of equations (Section 6.2). Therefore, it has the form $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where A is the transpose of the matrix of coefficients for the obvious subsystem. By the same argument, A is the matrix of $\hat{T}$ relative to the basis $\{w_i\}$ of W.


10.6. Let $\hat{T}$ denote the restriction of an operator T to an invariant subspace W. Prove

(a) For any polynomial f(t), $f(\hat{T})(w) = f(T)(w)$ for every $w \in W$.

(b) The minimal polynomial of $\hat{T}$ divides the minimal polynomial of T.

(a) If f(t) = 0 or if f(t) has degree at most 1 (a constant or a linear polynomial), then the result clearly holds, because $\hat{T}(w) = T(w)$ for every $w \in W$.

Assume deg f = n > 1 and that the result holds for polynomials of degree less than n. Suppose that

$f(t) = a_n t^n + a_{n-1} t^{n-1} + \cdots + a_1 t + a_0$

Then

\[
\begin{aligned}
f(\hat{T})(w) &= (a_n \hat{T}^n + a_{n-1}\hat{T}^{n-1} + \cdots + a_0 I)(w) \\
&= (a_n \hat{T}^{n-1})(\hat{T}(w)) + (a_{n-1}\hat{T}^{n-1} + \cdots + a_0 I)(w) \\
&= (a_n T^{n-1})(T(w)) + (a_{n-1}T^{n-1} + \cdots + a_0 I)(w) = f(T)(w)
\end{aligned}
\]

(b) Let m(t) denote the minimal polynomial of T. Then by (a), $m(\hat{T})(w) = m(T)(w) = 0(w) = 0$ for every $w \in W$; that is, $\hat{T}$ is a zero of the polynomial m(t). Hence, the minimal polynomial of $\hat{T}$ divides m(t).

Invariant Direct-Sum Decompositions

10.7. Prove Theorem 10.4: Suppose $W_1, W_2, \ldots, W_r$ are subspaces of V with respective bases

$B_1 = \{w_{11}, w_{12}, \ldots, w_{1n_1}\},\ \ldots,\ B_r = \{w_{r1}, w_{r2}, \ldots, w_{rn_r}\}$

Then V is the direct sum of the $W_i$ if and only if the union $B = \bigcup_i B_i$ is a basis of V.

Suppose B is a basis of V. Then, for any $v \in V$,

$v = a_{11}w_{11} + \cdots + a_{1n_1}w_{1n_1} + \cdots + a_{r1}w_{r1} + \cdots + a_{rn_r}w_{rn_r} = w_1 + w_2 + \cdots + w_r$

where $w_i = a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} \in W_i$. We next show that such a sum is unique. Suppose

$v = w_1' + w_2' + \cdots + w_r'$, where $w_i' \in W_i$

Because $\{w_{i1}, \ldots, w_{in_i}\}$ is a basis of $W_i$, $w_i' = b_{i1}w_{i1} + \cdots + b_{in_i}w_{in_i}$, and so

$v = b_{11}w_{11} + \cdots + b_{1n_1}w_{1n_1} + \cdots + b_{r1}w_{r1} + \cdots + b_{rn_r}w_{rn_r}$

Because B is a basis of V, $a_{ij} = b_{ij}$ for each i and each j. Hence, $w_i = w_i'$, and so the sum for v is unique. Accordingly, V is the direct sum of the $W_i$.

Conversely, suppose V is the direct sum of the $W_i$. Then for any $v \in V$, $v = w_1 + \cdots + w_r$, where $w_i \in W_i$. Because $\{w_{ij}\}$ is a basis of $W_i$, each $w_i$ is a linear combination of the $w_{ij}$, and so v is a linear combination of the elements of B. Thus, B spans V. We now show that B is linearly independent. Suppose

$a_{11}w_{11} + \cdots + a_{1n_1}w_{1n_1} + \cdots + a_{r1}w_{r1} + \cdots + a_{rn_r}w_{rn_r} = 0$

Note that $a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} \in W_i$. We also have that $0 = 0 + 0 + \cdots + 0$, where $0 \in W_i$. Because such a sum for 0 is unique,

$a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} = 0$ for $i = 1, \ldots, r$

The independence of the bases $\{w_{ij}\}$ implies that all the a's are 0. Thus, B is linearly independent and is a basis of V.


10.8. Suppose $T: V \to V$ is linear and suppose $T = T_1 \oplus T_2$ with respect to a T-invariant direct-sum decomposition $V = U \oplus W$. Show that

(a) m(t) is the least common multiple of $m_1(t)$ and $m_2(t)$, where $m(t)$, $m_1(t)$, $m_2(t)$ are the minimal polynomials of $T$, $T_1$, $T_2$, respectively.

(b) $\Delta(t) = \Delta_1(t)\Delta_2(t)$, where $\Delta(t)$, $\Delta_1(t)$, $\Delta_2(t)$ are the characteristic polynomials of $T$, $T_1$, $T_2$, respectively.

(a) By Problem 10.6, each of $m_1(t)$ and $m_2(t)$ divides m(t). Now suppose f(t) is a multiple of both $m_1(t)$ and $m_2(t)$; then $f(T_1)(U) = 0$ and $f(T_2)(W) = 0$. Let $v \in V$; then v = u + w with $u \in U$ and $w \in W$. Now

$f(T)v = f(T)u + f(T)w = f(T_1)u + f(T_2)w = 0 + 0 = 0$

That is, T is a zero of f(t). Hence, m(t) divides f(t), and so m(t) is the least common multiple of $m_1(t)$ and $m_2(t)$.

(b) By Theorem 10.5, T has a matrix representation $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where A and B are matrix representations of $T_1$ and $T_2$, respectively. Then, as required,

\[
\Delta(t) = |tI - M| = \begin{vmatrix} tI - A & 0 \\ 0 & tI - B \end{vmatrix} = |tI - A|\,|tI - B| = \Delta_1(t)\Delta_2(t)
\]


10.9. Prove Theorem 10.7: Suppose $T: V \to V$ is linear, and suppose f(t) = g(t)h(t) are polynomials such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the direct sum of the T-invariant subspaces U and W, where U = Ker g(T) and W = Ker h(T).

Note first that U and W are T-invariant by Theorem 10.2. Now, because g(t) and h(t) are relatively prime, there exist polynomials r(t) and s(t) such that

$r(t)g(t) + s(t)h(t) = 1$

Hence, for the operator T, $r(T)g(T) + s(T)h(T) = I$ (*)

Let $v \in V$; then, by (*), $v = r(T)g(T)v + s(T)h(T)v$

But the first term in this sum belongs to W = Ker h(T), because

$h(T)r(T)g(T)v = r(T)g(T)h(T)v = r(T)f(T)v = r(T)0v = 0$

Similarly, the second term belongs to U. Hence, V is the sum of U and W.

To prove that $V = U \oplus W$, we must show that a sum v = u + w with $u \in U$, $w \in W$, is uniquely determined by v. Applying the operator r(T)g(T) to v = u + w and using g(T)u = 0, we obtain

$r(T)g(T)v = r(T)g(T)u + r(T)g(T)w = r(T)g(T)w$

Also, applying (*) to w alone and using h(T)w = 0, we obtain

$w = r(T)g(T)w + s(T)h(T)w = r(T)g(T)w$

Both of the above formulas give us w = r(T)g(T)v, and so w is uniquely determined by v. Similarly, u is uniquely determined by v. Hence, $V = U \oplus W$, as required.
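
To make the construction in Problem 10.9 concrete, here is a minimal Python sketch under stated assumptions. The operator T on $\mathbf{R}^3$, the vector v, and the helper names `matvec` and `shift` are hypothetical choices of ours, not taken from the text. We take f(t) = (t - 1)(t - 2), g(t) = t - 1, h(t) = t - 2, for which r(t) = 1 and s(t) = -1 satisfy r(t)g(t) + s(t)h(t) = 1.

```python
def matvec(M, v):
    """Apply the matrix M (list of rows) to the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def shift(M, c):
    """Return M - c*I."""
    n = len(M)
    return [[M[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]


# Hypothetical operator with (T - I)(T - 2I) = 0.
T = [[1, 1, 0],
     [0, 2, 0],
     [0, 0, 2]]

g_T = shift(T, 1)   # g(T) = T - I,  so U = Ker g(T)
h_T = shift(T, 2)   # h(T) = T - 2I, so W = Ker h(T)

v = [3, 4, 5]

# With r = 1 and s = -1, the identity (*) reads  v = g(T)v - h(T)v.
w = matvec(g_T, v)                  # r(T)g(T)v, the component in W
u = [-x for x in matvec(h_T, v)]    # s(T)h(T)v, the component in U

print("u =", u, " w =", w)
print("u + w =", [a + b for a, b in zip(u, w)])
print("g(T)u =", matvec(g_T, u), " h(T)w =", matvec(h_T, w))
```

The two components add up to v, and each is annihilated by the expected factor, which is exactly the decomposition v = u + w used in the proof.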


10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if f(t) is the minimal polynomial of T (and g(t) and h(t) are monic), then g(t) is the minimal polynomial of the restriction $T_1$ of T to U and h(t) is the minimal polynomial of the restriction $T_2$ of T to W.

Let $m_1(t)$ and $m_2(t)$ be the minimal polynomials of $T_1$ and $T_2$, respectively. Note that $g(T_1) = 0$ and $h(T_2) = 0$ because U = Ker g(T) and W = Ker h(T). Thus,

$m_1(t)$ divides g(t) and $m_2(t)$ divides h(t) (1)

By Problem 10.8(a), applied to the decomposition $V = U \oplus W$ of Problem 10.9, f(t) is the least common multiple of $m_1(t)$ and $m_2(t)$. But $m_1(t)$ and $m_2(t)$ are relatively prime because g(t) and h(t) are relatively prime. Accordingly, $f(t) = m_1(t)m_2(t)$. We also have that f(t) = g(t)h(t). These two equations together with (1) and the fact that all the polynomials are monic imply that $g(t) = m_1(t)$ and $h(t) = m_2(t)$, as required.

10.11. Prove the Primary Decomposition Theorem 10.6: Let $T: V \to V$ be a linear operator with minimal polynomial

$m(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$

where the $f_i(t)$ are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant subspaces $W_1, \ldots, W_r$, where $W_i$ is the kernel of $f_i(T)^{n_i}$. Moreover, $f_i(t)^{n_i}$ is the minimal polynomial of the restriction of T to $W_i$.

The proof is by induction on r. The case r = 1 is trivial. Suppose that the theorem has been proved for r - 1. By Theorem 10.7, we can write V as the direct sum of T-invariant subspaces $W_1$ and $V_1$, where $W_1$ is the kernel of $f_1(T)^{n_1}$ and where $V_1$ is the kernel of $f_2(T)^{n_2} \cdots f_r(T)^{n_r}$. By Theorem 10.8, the minimal polynomials of the restrictions of T to $W_1$ and $V_1$ are $f_1(t)^{n_1}$ and $f_2(t)^{n_2} \cdots f_r(t)^{n_r}$, respectively.

Denote the restriction of T to $V_1$ by $T_1$. By the inductive hypothesis, $V_1$ is the direct sum of subspaces $W_2, \ldots, W_r$ such that $W_i$ is the kernel of $f_i(T_1)^{n_i}$ and such that $f_i(t)^{n_i}$ is the minimal polynomial for the restriction of $T_1$ to $W_i$. But the kernel of $f_i(T)^{n_i}$, for $i = 2, \ldots, r$, is necessarily contained in $V_1$, because $f_i(t)^{n_i}$ divides $f_2(t)^{n_2} \cdots f_r(t)^{n_r}$. Thus, the kernel of $f_i(T)^{n_i}$ is the same as the kernel of $f_i(T_1)^{n_i}$, which is $W_i$. Also, the restriction of T to $W_i$ is the same as the restriction of $T_1$ to $W_i$ (for $i = 2, \ldots, r$); hence, $f_i(t)^{n_i}$ is also the minimal polynomial for the restriction of T to $W_i$. Thus, $V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$ is the desired decomposition of V.

10.12. Prove Theorem 10.9: A linear operator $T: V \to V$ has a diagonal matrix representation if and only if its minimal polynomial m(t) is a product of distinct linear polynomials.

Suppose m(t) is a product of distinct linear polynomials, say,

$m(t) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_r)$

where the $\lambda_i$ are distinct scalars. By the Primary Decomposition Theorem, V is the direct sum of subspaces $W_1, \ldots, W_r$, where $W_i = \operatorname{Ker}(T - \lambda_i I)$. Thus, if $v \in W_i$, then $(T - \lambda_i I)(v) = 0$ or $T(v) = \lambda_i v$. In other words, every nonzero vector in $W_i$ is an eigenvector belonging to the eigenvalue $\lambda_i$. By Theorem 10.4, the union of bases for $W_1, \ldots, W_r$ is a basis of V. This basis consists of eigenvectors, and so T is diagonalizable.

Conversely, suppose T is diagonalizable (i.e., V has a basis consisting of eigenvectors of T). Let $\lambda_1, \ldots, \lambda_s$ be the distinct eigenvalues of T. Then the operator

$f(T) = (T - \lambda_1 I)(T - \lambda_2 I) \cdots (T - \lambda_s I)$

maps each basis vector into 0. Thus, f(T) = 0, and hence, the minimal polynomial m(t) of T divides the polynomial

$f(t) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_s)$

Accordingly, m(t) is a product of distinct linear polynomials.
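
The criterion of Theorem 10.9 can be tested directly once the distinct eigenvalues are known. The sketch below is a minimal Python illustration, assuming the eigenvalues are supplied by the caller; the function name `annihilated_by_distinct_factors` and both test matrices are hypothetical examples of ours. It multiplies the factors $(M - \lambda_i I)$ over the distinct eigenvalues and checks whether the product is the zero matrix.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift(M, c):
    """Return M - c*I."""
    n = len(M)
    return [[M[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]


def annihilated_by_distinct_factors(M, eigenvalues):
    """True if the product of (M - lam*I) over the distinct eigenvalues is 0.

    The eigenvalues are assumed to be given; they are not computed here.
    """
    n = len(M)
    product = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for lam in sorted(set(eigenvalues)):
        product = matmul(product, shift(M, lam))
    return all(entry == 0 for row in product for entry in row)


M1 = [[1, 1],
      [0, 2]]   # eigenvalues 1 and 2, minimal polynomial (t - 1)(t - 2)
M2 = [[2, 1],
      [0, 2]]   # single eigenvalue 2, minimal polynomial (t - 2)^2

print(annihilated_by_distinct_factors(M1, [1, 2]))  # diagonalizable
print(annihilated_by_distinct_factors(M2, [2, 2]))  # not diagonalizable
```

For M2 the single factor M2 - 2I is nonzero, so the minimal polynomial has a repeated root and no basis of eigenvectors exists.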


Nilpotent Operators, Jordan Canonical Form 

10.13. Let $T: V \to V$ be linear. Suppose, for $v \in V$, $T^k(v) = 0$ but $T^{k-1}(v) \neq 0$. Prove

(a) The set $S = \{v, T(v), \ldots, T^{k-1}(v)\}$ is linearly independent.

(b) The subspace W generated by S is T-invariant.

(c) The restriction $\hat{T}$ of T to W is nilpotent of index k.

(d) Relative to the basis $\{T^{k-1}(v), \ldots, T(v), v\}$ of W, the matrix of $\hat{T}$ is the k-square Jordan nilpotent block $N_k$ of index k (see Example 10.5).

(a) Suppose

$av + a_1T(v) + a_2T^2(v) + \cdots + a_{k-1}T^{k-1}(v) = 0$ (*)

Applying $T^{k-1}$ to (*) and using $T^k(v) = 0$, we obtain $aT^{k-1}(v) = 0$; because $T^{k-1}(v) \neq 0$, a = 0. Now applying $T^{k-2}$ to (*) and using $T^k(v) = 0$ and a = 0, we find $a_1T^{k-1}(v) = 0$; hence, $a_1 = 0$. Next applying $T^{k-3}$ to (*) and using $T^k(v) = 0$ and $a = a_1 = 0$, we obtain $a_2T^{k-1}(v) = 0$; hence, $a_2 = 0$. Continuing this process, we find that all the a's are 0; hence, S is independent.

(b) Let $u \in W$. Then

$u = bv + b_1T(v) + b_2T^2(v) + \cdots + b_{k-1}T^{k-1}(v)$

Using $T^k(v) = 0$, we have

$T(u) = bT(v) + b_1T^2(v) + \cdots + b_{k-2}T^{k-1}(v) \in W$

Thus, W is T-invariant.

(c) By hypothesis, $T^k(v) = 0$. Hence, for $i = 0, \ldots, k - 1$,

$\hat{T}^k(T^i(v)) = T^{k+i}(v) = 0$

That is, applying $\hat{T}^k$ to each generator of W, we obtain 0; hence, $\hat{T}^k = 0$ and so $\hat{T}$ is nilpotent of index at most k. On the other hand, $\hat{T}^{k-1}(v) = T^{k-1}(v) \neq 0$; hence, $\hat{T}$ is nilpotent of index exactly k.

(d) For the basis $\{T^{k-1}(v), T^{k-2}(v), \ldots, T(v), v\}$ of W,

\[
\begin{aligned}
\hat{T}(T^{k-1}(v)) &= T^k(v) = 0 \\
\hat{T}(T^{k-2}(v)) &= T^{k-1}(v) \\
\hat{T}(T^{k-3}(v)) &= T^{k-2}(v) \\
&\ \ \vdots \\
\hat{T}(T(v)) &= T^2(v) \\
\hat{T}(v) &= T(v)
\end{aligned}
\]

Hence, as required, the matrix of $\hat{T}$ in this basis is the k-square Jordan nilpotent block $N_k$.
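
The chain $v, T(v), \ldots, T^{k-1}(v)$ of Problem 10.13 is easy to build by machine. The following is a minimal Python sketch; the nilpotent matrix T, the starting vector v, and the helper name `matvec` are hypothetical choices of ours. It builds the chain, reverses it into the basis of part (d), and checks that T sends each basis vector to the previous one (and the first to 0), which is exactly the statement that the restriction is represented by $N_k$.

```python
def matvec(M, v):
    """Apply the matrix M (list of rows) to the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


# Hypothetical nilpotent operator on R^4 (strictly upper triangular).
T = [[0, 1, 2, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
v = [0, 0, 1, 0]

# Build v, T(v), T^2(v), ... until the zero vector appears.
chain = [v]
while any(chain[-1]):
    chain.append(matvec(T, chain[-1]))
chain.pop()                 # drop the zero vector T^k(v)
k = len(chain)              # smallest k with T^k(v) = 0

basis = chain[::-1]         # [T^(k-1)(v), ..., T(v), v], as in part (d)

zero = [0] * len(v)
relations_hold = all(
    matvec(T, basis[j]) == (basis[j - 1] if j > 0 else zero)
    for j in range(k)
)

# Jordan nilpotent block N_k: 1's directly above the diagonal.
N_k = [[1 if j == i + 1 else 0 for j in range(k)] for i in range(k)]

print("k =", k)
print("basis =", basis)
print("T shifts the basis as N_k does:", relations_hold)
print("N_k =", N_k)
```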
10.14. Let $T: V \to V$ be linear. Let $U = \operatorname{Ker} T^i$ and $W = \operatorname{Ker} T^{i+1}$. Show that

(a) $U \subseteq W$, (b) $T(W) \subseteq U$.

(a) Suppose $u \in U = \operatorname{Ker} T^i$. Then $T^i(u) = 0$ and so $T^{i+1}(u) = T(T^i(u)) = T(0) = 0$. Thus, $u \in \operatorname{Ker} T^{i+1} = W$. But this is true for every $u \in U$; hence, $U \subseteq W$.

(b) Similarly, if $w \in W = \operatorname{Ker} T^{i+1}$, then $T^{i+1}(w) = 0$. Thus, $T^i(T(w)) = T^{i+1}(w) = 0$, so $T(w) \in \operatorname{Ker} T^i = U$, and hence $T(W) \subseteq U$.

10.15. Let $T: V \to V$ be linear. Let $X = \operatorname{Ker} T^{i-2}$, $Y = \operatorname{Ker} T^{i-1}$, $Z = \operatorname{Ker} T^i$. Therefore (Problem 10.14), $X \subseteq Y \subseteq Z$. Suppose

$\{u_1, \ldots, u_r\}$, $\{u_1, \ldots, u_r, v_1, \ldots, v_s\}$, $\{u_1, \ldots, u_r, v_1, \ldots, v_s, w_1, \ldots, w_t\}$

are bases of X, Y, Z, respectively. Show that

$S = \{u_1, \ldots, u_r, T(w_1), \ldots, T(w_t)\}$

is contained in Y and is linearly independent.

By Problem 10.14, $T(Z) \subseteq Y$, and hence $S \subseteq Y$. Now suppose S is linearly dependent. Then there exists a relation

$a_1u_1 + \cdots + a_ru_r + b_1T(w_1) + \cdots + b_tT(w_t) = 0$

where at least one coefficient is not zero. Furthermore, because $\{u_i\}$ is independent, at least one of the $b_k$ must be nonzero. Transposing, we find

$b_1T(w_1) + \cdots + b_tT(w_t) = -a_1u_1 - \cdots - a_ru_r \in X = \operatorname{Ker} T^{i-2}$

Hence, $T^{i-2}(b_1T(w_1) + \cdots + b_tT(w_t)) = 0$

Thus, $T^{i-1}(b_1w_1 + \cdots + b_tw_t) = 0$, and so $b_1w_1 + \cdots + b_tw_t \in Y = \operatorname{Ker} T^{i-1}$

Because $\{u_i, v_j\}$ generates Y, we obtain a relation among the $u_i$, $v_j$, $w_k$ where one of the coefficients (i.e., one of the $b_k$) is not zero. This contradicts the fact that $\{u_i, v_j, w_k\}$ is independent. Hence, S must also be independent.

10.16. Prove Theorem 10.10: Let $T: V \to V$ be a nilpotent operator of index k. Then T has a unique block diagonal matrix representation consisting of Jordan nilpotent blocks N. There is at least one N of order k, and all other N are of orders $\le k$. The total number of N of all orders is equal to the nullity of T.

Suppose dim V = n. Let $W_1 = \operatorname{Ker} T$, $W_2 = \operatorname{Ker} T^2$, ..., $W_k = \operatorname{Ker} T^k$. Let us set $m_i = \dim W_i$, for $i = 1, \ldots, k$. Because T is of index k, $W_k = V$ and $W_{k-1} \neq V$, and so $m_{k-1} < m_k = n$. By Problem 10.14,

$W_1 \subseteq W_2 \subseteq \cdots \subseteq W_k = V$

Thus, by induction, we can choose a basis $\{u_1, \ldots, u_n\}$ of V such that $\{u_1, \ldots, u_{m_i}\}$ is a basis of $W_i$.

We now choose a new basis for V with respect to which T has the desired form. It will be convenient to label the members of this new basis by pairs of indices. We begin by setting

$v(1, k) = u_{m_{k-1}+1},\quad v(2, k) = u_{m_{k-1}+2},\quad \ldots,\quad v(m_k - m_{k-1}, k) = u_{m_k}$

and setting

$v(1, k-1) = Tv(1, k),\quad v(2, k-1) = Tv(2, k),\quad \ldots,\quad v(m_k - m_{k-1}, k-1) = Tv(m_k - m_{k-1}, k)$

By the preceding problem,

$S_1 = \{u_1, \ldots, u_{m_{k-2}},\ v(1, k-1), \ldots, v(m_k - m_{k-1}, k-1)\}$

is a linearly independent subset of $W_{k-1}$. We extend $S_1$ to a basis of $W_{k-1}$ by adjoining new elements (if necessary), which we denote by

$v(m_k - m_{k-1} + 1, k-1),\quad v(m_k - m_{k-1} + 2, k-1),\quad \ldots,\quad v(m_{k-1} - m_{k-2}, k-1)$

Next we set

$v(1, k-2) = Tv(1, k-1),\quad v(2, k-2) = Tv(2, k-1),\quad \ldots,\quad v(m_{k-1} - m_{k-2}, k-2) = Tv(m_{k-1} - m_{k-2}, k-1)$

Again by the preceding problem,

$S_2 = \{u_1, \ldots, u_{m_{k-3}},\ v(1, k-2), \ldots, v(m_{k-1} - m_{k-2}, k-2)\}$

is a linearly independent subset of $W_{k-2}$, which we can extend to a basis of $W_{k-2}$ by adjoining elements

$v(m_{k-1} - m_{k-2} + 1, k-2),\quad v(m_{k-1} - m_{k-2} + 2, k-2),\quad \ldots,\quad v(m_{k-2} - m_{k-3}, k-2)$

Continuing in this manner, we get a new basis for V, which for convenient reference we arrange as follows:

\[
\begin{array}{l}
v(1, k), \ldots, v(m_k - m_{k-1}, k) \\
v(1, k-1), \ldots, v(m_k - m_{k-1}, k-1), \ldots, v(m_{k-1} - m_{k-2}, k-1) \\
\qquad\vdots \\
v(1, 2), \ldots, v(m_k - m_{k-1}, 2), \ldots, v(m_{k-1} - m_{k-2}, 2), \ldots, v(m_2 - m_1, 2) \\
v(1, 1), \ldots, v(m_k - m_{k-1}, 1), \ldots, v(m_{k-1} - m_{k-2}, 1), \ldots, v(m_2 - m_1, 1), \ldots, v(m_1, 1)
\end{array}
\]

The bottom row forms a basis of $W_1$, the bottom two rows form a basis of $W_2$, and so forth. But what is important for us is that T maps each vector into the vector immediately below it in the table or into 0 if the vector is in the bottom row. That is,

\[
Tv(i, j) = \begin{cases} v(i, j-1) & \text{for } j > 1 \\ 0 & \text{for } j = 1 \end{cases}
\]

Now it is clear [see Problem 10.13(d)] that T will have the desired form if the v(i, j) are ordered lexicographically: beginning with v(1, 1) and moving up the first column to v(1, k), then jumping to v(2, 1) and moving up the second column as far as possible.

Moreover, there will be exactly $m_k - m_{k-1}$ diagonal entries of order k. Also, there will be

\[
\begin{array}{ll}
(m_{k-1} - m_{k-2}) - (m_k - m_{k-1}) = 2m_{k-1} - m_k - m_{k-2} & \text{diagonal entries of order } k - 1 \\
\qquad\vdots & \\
2m_2 - m_1 - m_3 & \text{diagonal entries of order 2} \\
2m_1 - m_2 & \text{diagonal entries of order 1}
\end{array}
\]

as can be read off directly from the table. In particular, because the numbers $m_1, \ldots, m_k$ are uniquely determined by T, the number of diagonal entries of each order is uniquely determined by T. Finally, the identity

$m_1 = (m_k - m_{k-1}) + (2m_{k-1} - m_k - m_{k-2}) + \cdots + (2m_2 - m_1 - m_3) + (2m_1 - m_2)$

shows that the nullity $m_1$ of T is the total number of diagonal entries of T.
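
The block counts derived in Problem 10.16 depend only on the numbers $m_i = \dim \operatorname{Ker} T^i$, so they can be computed mechanically. The following is a minimal Python sketch, not the author's method of presentation: the helper names `rank` and `jordan_block_counts` are our own, and the rank is found by Gaussian elimination over the rationals. With the conventions $m_0 = 0$ and $m_{k+1} = m_k$, the number of blocks of order j is $2m_j - m_{j-1} - m_{j+1}$, which reproduces each line of the list above. The sample matrix is the matrix A of Problem 10.17.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(M):
    """Rank of M by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def jordan_block_counts(N):
    """Map each order j to the number of Jordan nilpotent blocks of order j."""
    n = len(N)
    m = [0]                                   # m[i] = dim Ker N^i, m[0] = 0
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(n):
        power = matmul(power, N)
        m.append(n - rank(power))
        if m[-1] == n:
            break
    if m[-1] != n:
        raise ValueError("matrix is not nilpotent")
    k = len(m) - 1                            # index of nilpotency
    m.append(n)                               # convention m[k+1] = m[k]
    return {j: 2 * m[j] - m[j - 1] - m[j + 1] for j in range(1, k + 1)}


# Matrix A of Problem 10.17.
A = [[0, 1, 1, 0, 1],
     [0, 0, 1, 1, 1],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0]]

counts = jordan_block_counts(A)
print(counts)
print("total number of blocks:", sum(counts.values()))
```

For this A the kernel dimensions are 3, 4, 5, giving one block of order 3 and two of order 1; the total, 3, equals the nullity of A.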


10.17. Let

\[
A = \begin{bmatrix} 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]

The reader can verify that A and B are both nilpotent of index 3; that is, $A^3 = 0$ but $A^2 \neq 0$, and $B^3 = 0$ but $B^2 \neq 0$. Find the nilpotent matrices $M_A$ and $M_B$ in canonical form that are similar to A and B, respectively.

Because A and B are nilpotent of index 3, $M_A$ and $M_B$ must each contain a Jordan nilpotent block of order 3, and none greater than 3. Note that rank(A) = 2 and rank(B) = 3, so nullity(A) = 5 - 2 = 3 and nullity(B) = 5 - 3 = 2. Thus, $M_A$ must contain three diagonal blocks, which must be one of order 3 and two of order 1; and $M_B$ must contain two diagonal blocks, which must be one of order 3 and one of order 2. Namely,

\[
M_A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\quad\text{and}\quad
M_B = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
10.18. Prove Theorem 10.11 on the Jordan canonical form for an operator T.

By the primary decomposition theorem, T is decomposable into operators $T_1, \ldots, T_r$; that is, $T = T_1 \oplus \cdots \oplus T_r$, where $(t - \lambda_i)^{m_i}$ is the minimal polynomial of $T_i$. Thus, in particular,

$(T_1 - \lambda_1 I)^{m_1} = 0,\ \ldots,\ (T_r - \lambda_r I)^{m_r} = 0$

Set $N_i = T_i - \lambda_i I$. Then, for $i = 1, \ldots, r$,

$T_i = N_i + \lambda_i I$, where $N_i^{m_i} = 0$

That is, $T_i$ is the sum of the scalar operator $\lambda_i I$ and a nilpotent operator $N_i$, which is of index $m_i$ because $(t - \lambda_i)^{m_i}$ is the minimal polynomial of $T_i$.

Now, by Theorem 10.10 on nilpotent operators, we can choose a basis so that $N_i$ is in canonical form. In this basis, $T_i = N_i + \lambda_i I$ is represented by a block diagonal matrix $M_i$ whose diagonal entries are the matrices $J_{ij}$. The direct sum J of the matrices $M_i$ is in Jordan canonical form and, by Theorem 10.5, is a matrix representation of T.

Last, we must show that the blocks $J_{ij}$ satisfy the required properties. Property (i) follows from the fact that $N_i$ is of index $m_i$. Property (ii) is true because T and J have the same characteristic polynomial. Property (iii) is true because the nullity of $N_i = T_i - \lambda_i I$ is equal to the geometric multiplicity of the eigenvalue $\lambda_i$. Property (iv) follows from the fact that the $T_i$ and hence the $N_i$ are uniquely determined by T.


10.19. Determine all possible Jordan canonical forms J for a linear operator $T: V \to V$ whose characteristic polynomial is $\Delta(t) = (t - 2)^5$ and whose minimal polynomial is $m(t) = (t - 2)^2$.

J must be a 5 x 5 matrix, because $\Delta(t)$ has degree 5, and all diagonal elements must be 2, because 2 is the only eigenvalue. Moreover, because the exponent of t - 2 in m(t) is 2, J must have at least one Jordan block of order 2, and the others must be of order 2 or 1. Thus, there are only two possibilities:

\[
J = \operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2] \right)
\quad\text{or}\quad
J = \operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2],\ [2],\ [2] \right)
\]


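10.20. Determine all possible Jordan canonical forms for a linear operator $T: V \to V$ whose characteristic polynomial is $\Delta(t) = (t - 2)^3(t - 5)^2$. In each case, find the minimal polynomial m(t).

Because t - 2 has exponent 3 in $\Delta(t)$, 2 must appear three times on the diagonal. Similarly, 5 must appear twice. Thus, there are six possibilities:

\[
\begin{array}{ll}
\text{(a)}\ \operatorname{diag}\left( \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix},\ \begin{bmatrix} 5 & 1 \\ 0 & 5 \end{bmatrix} \right), &
\text{(b)}\ \operatorname{diag}\left( \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix},\ [5],\ [5] \right), \\[2ex]
\text{(c)}\ \operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2],\ \begin{bmatrix} 5 & 1 \\ 0 & 5 \end{bmatrix} \right), &
\text{(d)}\ \operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2],\ [5],\ [5] \right), \\[2ex]
\text{(e)}\ \operatorname{diag}\left( [2],\ [2],\ [2],\ \begin{bmatrix} 5 & 1 \\ 0 & 5 \end{bmatrix} \right), &
\text{(f)}\ \operatorname{diag}([2],\ [2],\ [2],\ [5],\ [5])
\end{array}
\]

The exponent of each factor in the minimal polynomial m(t) is equal to the size of the largest block for the corresponding eigenvalue. Thus,

(a) $m(t) = (t - 2)^3(t - 5)^2$, (b) $m(t) = (t - 2)^3(t - 5)$, (c) $m(t) = (t - 2)^2(t - 5)^2$,

(d) $m(t) = (t - 2)^2(t - 5)$, (e) $m(t) = (t - 2)(t - 5)^2$, (f) $m(t) = (t - 2)(t - 5)$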
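
The case analysis in Problems 10.19 and 10.20 amounts to listing integer partitions: for each eigenvalue, the block sizes form a partition of its algebraic multiplicity, and the exponent in m(t) is the largest part. The following is a minimal Python sketch of that enumeration; the function name `partitions` and the dictionary layout are our own illustrative choices.

```python
from itertools import product


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


# Algebraic multiplicities read off from Delta(t) = (t - 2)^3 (t - 5)^2.
multiplicities = {2: 3, 5: 2}
eigenvalues = sorted(multiplicities)

forms = []
choices = [list(partitions(multiplicities[lam])) for lam in eigenvalues]
for sizes in product(*choices):
    blocks = dict(zip(eigenvalues, sizes))            # eigenvalue -> block sizes
    min_poly = {lam: max(parts) for lam, parts in blocks.items()}
    forms.append((blocks, min_poly))

for blocks, min_poly in forms:
    print("block sizes:", blocks, " exponents in m(t):", min_poly)

# Problem 10.19: partitions of 5 whose largest part is exactly 2.
forms_10_19 = [p for p in partitions(5) if max(p) == 2]
print(forms_10_19)
```

The loop produces six forms in the same order as (a) through (f) above, and the final filter returns the two block patterns of Problem 10.19.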


Quotient Space and Triangular Form 

10.21. Let W be a subspace of a vector space V. Show that the following are equivalent:

(i) $u \in v + W$, (ii) $u - v \in W$, (iii) $v \in u + W$.

Suppose $u \in v + W$. Then there exists $w_0 \in W$ such that $u = v + w_0$. Hence, $u - v = w_0 \in W$. Conversely, suppose $u - v \in W$. Then $u - v = w_0$ where $w_0 \in W$. Hence, $u = v + w_0 \in v + W$. Thus, (i) and (ii) are equivalent.

We also have $u - v \in W$ iff $-(u - v) = v - u \in W$ iff $v \in u + W$. Thus, (ii) and (iii) are also equivalent.
10.22. Prove the following: The cosets of W in V partition V into mutually disjoint sets. That is,

(a) Any two cosets u + W and v + W are either identical or disjoint.

(b) Each $v \in V$ belongs to a coset; in fact, $v \in v + W$.

Furthermore, u + W = v + W if and only if $u - v \in W$, and so (v + w) + W = v + W for any $w \in W$.

Let $v \in V$. Because $0 \in W$, we have $v = v + 0 \in v + W$, which proves (b).

Now suppose the cosets u + W and v + W are not disjoint; say, the vector x belongs to both u + W and v + W. Then $u - x \in W$ and $x - v \in W$. The proof of (a) is complete if we show that u + W = v + W. Let $u + w_0$ be any element in the coset u + W. Because $u - x$, $x - v$, $w_0$ belong to W,

$(u + w_0) - v = (u - x) + (x - v) + w_0 \in W$

Thus, $u + w_0 \in v + W$, and hence the coset u + W is contained in the coset v + W. Similarly, v + W is contained in u + W, and so u + W = v + W.

The last statement follows from the fact that u + W = v + W if and only if $u \in v + W$, and, by Problem 10.21, this is equivalent to $u - v \in W$.

10.23. Let W be the solution space of the homogeneous equation 2x + 3y + 4z = 0. Describe the cosets of W in $\mathbf{R}^3$.

W is a plane through the origin 0 = (0, 0, 0), and the cosets of W are the planes parallel to W. Equivalently, the cosets of W are the solution sets of the family of equations

$2x + 3y + 4z = k, \quad k \in \mathbf{R}$

In fact, the coset v + W, where v = (a, b, c), is the solution set of the linear equation

$2x + 3y + 4z = 2a + 3b + 4c$ or $2(x - a) + 3(y - b) + 4(z - c) = 0$
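
The description of the cosets in Problem 10.23 gives a simple membership test. The following is a minimal Python sketch; the function names `coset_label` and `same_coset` and the sample points are hypothetical. Each coset v + W is labelled by the constant k = 2a + 3b + 4c, and two vectors lie in the same coset exactly when their difference lies in W (Problem 10.22).

```python
def coset_label(v):
    """Return k such that the coset v + W is the plane 2x + 3y + 4z = k."""
    x, y, z = v
    return 2 * x + 3 * y + 4 * z


def same_coset(u, v):
    """u + W equals v + W exactly when u - v satisfies 2x + 3y + 4z = 0."""
    difference = tuple(a - b for a, b in zip(u, v))
    return coset_label(difference) == 0


u = (1, 0, 0)
v = (0, 2, -1)
w = (1, 1, 1)

print(coset_label(u), coset_label(v), coset_label(w))
print(same_coset(u, v))   # both lie on the plane 2x + 3y + 4z = 2
print(same_coset(u, w))   # different parallel planes
```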

10.24. Suppose W is a subspace of a vector space V. Show that the operations in Theorem 10.15 are well defined; namely, show that if u + W = u' + W and v + W = v' + W, then

(a) (u + v) + W = (u' + v') + W and (b) ku + W = ku' + W for any $k \in K$

(a) Because u + W = u' + W and v + W = v' + W, both u - u' and v - v' belong to W. But then $(u + v) - (u' + v') = (u - u') + (v - v') \in W$. Hence, (u + v) + W = (u' + v') + W.

(b) Also, because $u - u' \in W$ implies $k(u - u') \in W$, then $ku - ku' = k(u - u') \in W$; accordingly, ku + W = ku' + W.

10.25. Let V be a vector space and W a subspace of V. Show that the natural map $\eta: V \to V/W$, defined by $\eta(v) = v + W$, is linear.

For any $u, v \in V$ and any $k \in K$, we have

$\eta(u + v) = u + v + W = (u + W) + (v + W) = \eta(u) + \eta(v)$

and $\eta(kv) = kv + W = k(v + W) = k\eta(v)$

Accordingly, $\eta$ is linear.

10.26. Let W be a subspace of a vector space V. Suppose $\{w_1, \ldots, w_r\}$ is a basis of W and the set of cosets $\{\bar{v}_1, \ldots, \bar{v}_s\}$, where $\bar{v}_j = v_j + W$, is a basis of the quotient space. Show that the set of vectors $B = \{v_1, \ldots, v_s, w_1, \ldots, w_r\}$ is a basis of V. Thus, dim V = dim W + dim(V/W).

Suppose $u \in V$. Because $\{\bar{v}_j\}$ is a basis of V/W,

$\bar{u} = u + W = a_1\bar{v}_1 + a_2\bar{v}_2 + \cdots + a_s\bar{v}_s$

Hence, $u = a_1v_1 + \cdots + a_sv_s + w$, where $w \in W$. Since $\{w_i\}$ is a basis of W,

$u = a_1v_1 + \cdots + a_sv_s + b_1w_1 + \cdots + b_rw_r$

Accordingly, B spans V.

We now show that B is linearly independent. Suppose

$c_1v_1 + \cdots + c_sv_s + d_1w_1 + \cdots + d_rw_r = 0$ (1)

Then $c_1\bar{v}_1 + \cdots + c_s\bar{v}_s = \bar{0} = W$

Because $\{\bar{v}_j\}$ is independent, the c's are all 0. Substituting into (1), we find $d_1w_1 + \cdots + d_rw_r = 0$. Because $\{w_i\}$ is independent, the d's are all 0. Thus, B is linearly independent and therefore a basis of V.

10.27. Prove Theorem 10.16: Suppose W is a subspace invariant under a linear operator $T: V \to V$. Then T induces a linear operator $\bar{T}$ on V/W defined by $\bar{T}(v + W) = T(v) + W$. Moreover, if T is a zero of any polynomial, then so is $\bar{T}$. Thus, the minimal polynomial of $\bar{T}$ divides the minimal polynomial of T.

We first show that $\bar{T}$ is well defined; that is, if u + W = v + W, then $\bar{T}(u + W) = \bar{T}(v + W)$. If u + W = v + W, then $u - v \in W$, and, as W is T-invariant, $T(u - v) = T(u) - T(v) \in W$. Accordingly,

$\bar{T}(u + W) = T(u) + W = T(v) + W = \bar{T}(v + W)$

as required.

We next show that $\bar{T}$ is linear. We have

\[
\begin{aligned}
\bar{T}((u + W) + (v + W)) &= \bar{T}(u + v + W) = T(u + v) + W = T(u) + T(v) + W \\
&= T(u) + W + T(v) + W = \bar{T}(u + W) + \bar{T}(v + W)
\end{aligned}
\]

Furthermore,

$\bar{T}(k(u + W)) = \bar{T}(ku + W) = T(ku) + W = kT(u) + W = k(T(u) + W) = k\bar{T}(u + W)$

Thus, $\bar{T}$ is linear.

Now, for any coset u + W in V/W,

$\overline{T^2}(u + W) = T^2(u) + W = T(T(u)) + W = \bar{T}(T(u) + W) = \bar{T}(\bar{T}(u + W)) = \bar{T}^2(u + W)$

Hence, $\overline{T^2} = \bar{T}^2$. Similarly, $\overline{T^n} = \bar{T}^n$ for any n. Thus, for any polynomial

$f(t) = a_nt^n + \cdots + a_0 = \sum a_it^i$

\[
\begin{aligned}
\overline{f(T)}(u + W) &= f(T)(u) + W = \sum a_iT^i(u) + W = \sum a_i(T^i(u) + W) \\
&= \sum a_i\overline{T^i}(u + W) = \sum a_i\bar{T}^i(u + W) = \Bigl(\sum a_i\bar{T}^i\Bigr)(u + W) = f(\bar{T})(u + W)
\end{aligned}
\]

and so $\overline{f(T)} = f(\bar{T})$. Accordingly, if T is a root of f(t), then $\overline{f(T)} = \bar{0} = W = f(\bar{T})$; that is, $\bar{T}$ is also a root of f(t). The theorem is proved.

10.28. Prove Theorem 10.1: Let $T: V \to V$ be a linear operator whose characteristic polynomial factors into linear polynomials. Then V has a basis in which T is represented by a triangular matrix.

The proof is by induction on the dimension of V. If dim V = 1, then every matrix representation of T is a 1 x 1 matrix, which is triangular.

Now suppose dim V = n > 1 and that the theorem holds for spaces of dimension less than n. Because the characteristic polynomial of T factors into linear polynomials, T has at least one eigenvalue and so at least one nonzero eigenvector v, say $T(v) = a_{11}v$. Let W be the one-dimensional subspace spanned by v. Set $\bar{V} = V/W$. Then (Problem 10.26) $\dim \bar{V} = \dim V - \dim W = n - 1$. Note also that W is invariant under T. By Theorem 10.16, T induces a linear operator $\bar{T}$ on $\bar{V}$ whose minimal polynomial divides the minimal polynomial of T. Because the characteristic polynomial of T is a product of linear polynomials, so is its minimal polynomial, and hence, so are the minimal and characteristic polynomials of $\bar{T}$. Thus, $\bar{V}$ and $\bar{T}$ satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis $\{\bar{v}_2, \ldots, \bar{v}_n\}$ of $\bar{V}$ such that

\[
\begin{aligned}
\bar{T}(\bar{v}_2) &= a_{22}\bar{v}_2 \\
\bar{T}(\bar{v}_3) &= a_{32}\bar{v}_2 + a_{33}\bar{v}_3 \\
&\ \ \vdots \\
\bar{T}(\bar{v}_n) &= a_{n2}\bar{v}_2 + a_{n3}\bar{v}_3 + \cdots + a_{nn}\bar{v}_n
\end{aligned}
\]

Now let $v_2, \ldots, v_n$ be elements of V that belong to the cosets $\bar{v}_2, \ldots, \bar{v}_n$, respectively. Then $\{v, v_2, \ldots, v_n\}$ is a basis of V (Problem 10.26). Because $\bar{T}(\bar{v}_2) = a_{22}\bar{v}_2$, we have

$\bar{T}(\bar{v}_2) - a_{22}\bar{v}_2 = \bar{0}$, and so $T(v_2) - a_{22}v_2 \in W$

But W is spanned by v; hence, $T(v_2) - a_{22}v_2$ is a multiple of v, say,

$T(v_2) - a_{22}v_2 = a_{21}v$, and so $T(v_2) = a_{21}v + a_{22}v_2$

Similarly, for $i = 3, \ldots, n$,

$T(v_i) - a_{i2}v_2 - a_{i3}v_3 - \cdots - a_{ii}v_i \in W$, and so $T(v_i) = a_{i1}v + a_{i2}v_2 + \cdots + a_{ii}v_i$

Thus,

\[
\begin{aligned}
T(v) &= a_{11}v \\
T(v_2) &= a_{21}v + a_{22}v_2 \\
&\ \ \vdots \\
T(v_n) &= a_{n1}v + a_{n2}v_2 + \cdots + a_{nn}v_n
\end{aligned}
\]

and hence the matrix of T in this basis is triangular.


Cyclic Subspaces, Rational Canonical Form 


10.29. Prove Theorem 10.12: Let Z(v, T) be a T-cyclic subspace, $T_v$ the restriction of T to Z(v, T), and $m_v(t) = t^k + a_{k-1}t^{k-1} + \cdots + a_0$ the T-annihilator of v. Then,

(i) The set $\{v, T(v), \ldots, T^{k-1}(v)\}$ is a basis of Z(v, T); hence, dim Z(v, T) = k.

(ii) The minimal polynomial of $T_v$ is $m_v(t)$.

(iii) The matrix of $T_v$ in the above basis is the companion matrix $C = C(m_v)$ of $m_v(t)$ [which has 1's below the diagonal, the negative of the coefficients $a_0, a_1, \ldots, a_{k-1}$ of $m_v(t)$ in the last column, and 0's elsewhere].

(i) By definition of $m_v(t)$, $T^k(v)$ is the first vector in the sequence $v, T(v), T^2(v), \ldots$ that is a linear combination of those vectors that precede it in the sequence; hence, the set $B = \{v, T(v), \ldots, T^{k-1}(v)\}$ is linearly independent. We now only have to show that Z(v, T) = L(B), the linear span of B. By the above, $T^k(v) \in L(B)$. We prove by induction that $T^n(v) \in L(B)$ for every n. Suppose n > k and $T^{n-1}(v) \in L(B)$, that is, $T^{n-1}(v)$ is a linear combination of $v, \ldots, T^{k-1}(v)$. Then $T^n(v) = T(T^{n-1}(v))$ is a linear combination of $T(v), \ldots, T^k(v)$. But $T^k(v) \in L(B)$; hence, $T^n(v) \in L(B)$ for every n. Consequently, $f(T)(v) \in L(B)$ for any polynomial f(t). Thus, Z(v, T) = L(B), and so B is a basis, as claimed.

(ii) Suppose $m(t) = t^s + b_{s-1}t^{s-1} + \cdots + b_0$ is the minimal polynomial of $T_v$. Then, because $v \in Z(v, T)$,

$0 = m(T_v)(v) = m(T)(v) = T^s(v) + b_{s-1}T^{s-1}(v) + \cdots + b_0v$

Thus, $T^s(v)$ is a linear combination of $v, T(v), \ldots, T^{s-1}(v)$, and therefore $k \le s$. However, $m_v(T)(v) = 0$, so $m_v(T)$ annihilates every vector $f(T)(v)$ of Z(v, T); that is, $m_v(T_v) = 0$. Then m(t) divides $m_v(t)$, and so $s \le k$. Accordingly, k = s and hence $m_v(t) = m(t)$.

(iii) We have

\[
\begin{aligned}
T_v(v) &= T(v) \\
T_v(T(v)) &= T^2(v) \\
&\ \ \vdots \\
T_v(T^{k-2}(v)) &= T^{k-1}(v) \\
T_v(T^{k-1}(v)) &= T^k(v) = -a_0v - a_1T(v) - a_2T^2(v) - \cdots - a_{k-1}T^{k-1}(v)
\end{aligned}
\]

By definition, the matrix of $T_v$ in this basis is the transpose of the matrix of coefficients of the above system of equations; hence, it is C, as required.
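
The shape of the companion matrix described in part (iii) can be written down directly from the coefficients. The following is a minimal Python sketch; the function names `companion` and `poly_at_matrix` are our own. It builds $C(m_v)$ from the list $a_0, a_1, \ldots, a_{k-1}$ and, as a sanity check, evaluates the monic polynomial at its own companion matrix by Horner's rule, which should give the zero matrix. The sample polynomial $t^4 - 4t^3 + 6t^2 - 4t - 7$ is the one used in the rational canonical form example earlier in the chapter.

```python
def companion(coeffs):
    """Companion matrix of t^k + a_{k-1} t^{k-1} + ... + a_0.

    coeffs lists a_0, a_1, ..., a_{k-1}; the leading coefficient 1 is implied.
    """
    k = len(coeffs)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1              # 1's directly below the diagonal
    for i in range(k):
        C[i][k - 1] = -coeffs[i]     # last column: -a_0, ..., -a_{k-1}
    return C


def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, M):
    """Evaluate t^k + a_{k-1} t^{k-1} + ... + a_0 at M using Horner's rule."""
    n = len(M)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for a in reversed(coeffs):
        result = matmul(result, M)
        result = [[result[i][j] + (a if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result


f1 = [-7, -4, 6, -4]                 # t^4 - 4t^3 + 6t^2 - 4t - 7
C = companion(f1)
for row in C:
    print(row)
print(poly_at_matrix(f1, C))         # the zero matrix
print(companion([9, -6]))            # C((t - 3)^2), since (t - 3)^2 = t^2 - 6t + 9
```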

10.30. Let $T: V \to V$ be linear. Let W be a T-invariant subspace of V and $\bar{T}$ the induced operator on V/W. Prove

(a) The T-annihilator of $v \in V$ divides the minimal polynomial of T.

(b) The $\bar{T}$-annihilator of $\bar{v} \in V/W$ divides the minimal polynomial of T.

(a) The T-annihilator of $v \in V$ is the minimal polynomial of the restriction of T to Z(v, T); therefore, by Problem 10.6, it divides the minimal polynomial of T.

(b) The $\bar{T}$-annihilator of $\bar{v} \in V/W$ divides the minimal polynomial of $\bar{T}$, which divides the minimal polynomial of T by Theorem 10.16.

Remark: In the case where the minimal polynomial of T is $f(t)^n$, where f(t) is a monic irreducible polynomial, the T-annihilator of $v \in V$ and the $\bar{T}$-annihilator of $\bar{v} \in V/W$ are of the form $f(t)^m$, where $m \le n$.

10.31. Prove Lemma 10.13: Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$, where
$f(t)$ is a monic irreducible polynomial. Then $V$ is the direct sum of $T$-cyclic subspaces
$Z_i = Z(v_i, T)$, $i = 1, \ldots, r$, with corresponding $T$-annihilators

$$f(t)^{n_1}, f(t)^{n_2}, \ldots, f(t)^{n_r}, \qquad n = n_1 \ge n_2 \ge \cdots \ge n_r$$

Any other decomposition of $V$ into the direct sum of $T$-cyclic subspaces has the same number of
components and the same set of $T$-annihilators.

The proof is by induction on the dimension of V. If dim V = 1, then V is T-cyclic and the lemma 
holds. Now suppose dim V > 1 and that the lemma holds for those vector spaces of dimension less than 
that of V. 

Because the minimal polynomial of $T$ is $f(t)^n$, there exists $v_1 \in V$ such that $f(T)^{n-1}(v_1) \ne 0$; hence, the
$T$-annihilator of $v_1$ is $f(t)^n$. Let $Z_1 = Z(v_1, T)$ and recall that $Z_1$ is $T$-invariant. Let $\bar{V} = V/Z_1$ and let $\bar{T}$
be the linear operator on $\bar{V}$ induced by $T$. By Theorem 10.16, the minimal polynomial of $\bar{T}$ divides $f(t)^n$;
hence, the hypothesis holds for $\bar{V}$ and $\bar{T}$. Consequently, by induction, $\bar{V}$ is the direct sum of $\bar{T}$-cyclic
subspaces; say,

$$\bar{V} = Z(\bar{v}_2, \bar{T}) \oplus \cdots \oplus Z(\bar{v}_r, \bar{T})$$

where the corresponding $\bar{T}$-annihilators are $f(t)^{n_2}, \ldots, f(t)^{n_r}$, $n \ge n_2 \ge \cdots \ge n_r$.

We claim that there is a vector $v_2$ in the coset $\bar{v}_2$ whose $T$-annihilator is $f(t)^{n_2}$, the $\bar{T}$-annihilator of $\bar{v}_2$.
Let $w$ be any vector in $\bar{v}_2$. Then $f(T)^{n_2}(w) \in Z_1$. Hence, there exists a polynomial $g(t)$ for which

$$f(T)^{n_2}(w) = g(T)(v_1) \qquad (1)$$

Because $f(t)^n$ is the minimal polynomial of $T$, we have, by (1),

$$0 = f(T)^n(w) = f(T)^{n-n_2} g(T)(v_1)$$

But $f(t)^n$ is the $T$-annihilator of $v_1$; hence, $f(t)^n$ divides $f(t)^{n-n_2} g(t)$, and so $g(t) = f(t)^{n_2} h(t)$ for some
polynomial $h(t)$. We set

$$v_2 = w - h(T)(v_1)$$

Because $w - v_2 = h(T)(v_1) \in Z_1$, $v_2$ also belongs to the coset $\bar{v}_2$. Thus, the $T$-annihilator of $v_2$ is a multiple
of the $\bar{T}$-annihilator of $\bar{v}_2$. On the other hand, by (1),

$$f(T)^{n_2}(v_2) = f(T)^{n_2}(w - h(T)(v_1)) = f(T)^{n_2}(w) - g(T)(v_1) = 0$$

Consequently, the $T$-annihilator of $v_2$ is $f(t)^{n_2}$, as claimed.

Similarly, there exist vectors $v_3, \ldots, v_r \in V$ such that $v_i \in \bar{v}_i$ and that the $T$-annihilator of $v_i$ is $f(t)^{n_i}$,
the $\bar{T}$-annihilator of $\bar{v}_i$. We set

$$Z_2 = Z(v_2, T), \ldots, Z_r = Z(v_r, T)$$

Let $d$ denote the degree of $f(t)$, so that $f(t)^{n_i}$ has degree $dn_i$. Then, because $f(t)^{n_i}$ is both the $T$-annihilator of
$v_i$ and the $\bar{T}$-annihilator of $\bar{v}_i$, we know that

$$\{v_i, T(v_i), \ldots, T^{dn_i-1}(v_i)\} \quad \text{and} \quad \{\bar{v}_i, \bar{T}(\bar{v}_i), \ldots, \bar{T}^{dn_i-1}(\bar{v}_i)\}$$

are bases for $Z(v_i, T)$ and $Z(\bar{v}_i, \bar{T})$, respectively, for $i = 2, \ldots, r$. But $\bar{V} = Z(\bar{v}_2, \bar{T}) \oplus \cdots \oplus Z(\bar{v}_r, \bar{T})$;
hence,

$$\{\bar{v}_2, \ldots, \bar{T}^{dn_2-1}(\bar{v}_2), \ldots, \bar{v}_r, \ldots, \bar{T}^{dn_r-1}(\bar{v}_r)\}$$

is a basis for $\bar{V}$. Therefore, by Problem 10.26 and the relation $\bar{T}^i(\bar{v}) = \overline{T^i(v)}$ (see Problem 10.27),

$$\{v_1, \ldots, T^{dn_1-1}(v_1), v_2, \ldots, T^{dn_2-1}(v_2), \ldots, v_r, \ldots, T^{dn_r-1}(v_r)\}$$

is a basis for $V$. Thus, by Theorem 10.4, $V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T)$, as required.

It remains to show that the exponents $n_1, \ldots, n_r$ are uniquely determined by $T$. Because
$d$ = degree of $f(t)$,

$$\dim V = d(n_1 + \cdots + n_r) \quad \text{and} \quad \dim Z_i = dn_i, \quad i = 1, \ldots, r$$

Also, if $s$ is any positive integer, then (Problem 10.59) $f(T)^s(Z_i)$ is a cyclic subspace generated by $f(T)^s(v_i)$,
and it has dimension $d(n_i - s)$ if $n_i > s$ and dimension 0 if $n_i \le s$.

Now any vector $v \in V$ can be written uniquely in the form $v = w_1 + \cdots + w_r$, where $w_i \in Z_i$. Hence,
any vector in $f(T)^s(V)$ can be written uniquely in the form

$$f(T)^s(v) = f(T)^s(w_1) + \cdots + f(T)^s(w_r)$$

where $f(T)^s(w_i) \in f(T)^s(Z_i)$. Let $t$ be the integer, dependent on $s$, for which

$$n_1 > s, \ldots, n_t > s, \quad n_{t+1} \le s$$

Then $f(T)^s(V) = f(T)^s(Z_1) \oplus \cdots \oplus f(T)^s(Z_t)$

and so $\dim[f(T)^s(V)] = d[(n_1 - s) + \cdots + (n_t - s)]$ (2)

The numbers on the left of (2) are uniquely determined by $T$. Set $s = n - 1$, and (2) determines the number
of $n_i$ equal to $n$. Next set $s = n - 2$, and (2) determines the number of $n_i$ (if any) equal to $n - 1$. We repeat
the process until we set $s = 0$ and determine the number of $n_i$ equal to 1. Thus, the $n_i$ are uniquely
determined by $T$ and $V$, and the lemma is proved.
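
The counting argument based on (2) can be traced on a small case. The following is a minimal sketch, not part of the original proof: the function names and the sample exponents are hypothetical, and the dimensions are generated from formula (2) rather than from an actual operator.

```python
def dims_of_images(exponents, d):
    """dim f(T)^s(V) for s = 0, ..., n, computed from formula (2):
    d times the sum of (n_i - s) over those n_i with n_i > s."""
    n = max(exponents)
    return [d * sum(ni - s for ni in exponents if ni > s) for s in range(n + 1)]


def recover_exponents(dims, d):
    """Recover the exponents n_i from the dimensions alone."""
    n = len(dims) - 1
    # dims[s] - dims[s+1] = d * (number of n_i greater than s)
    count_gt = [(dims[s] - dims[s + 1]) // d for s in range(n)] + [0]
    exponents = []
    for s in range(n, 0, -1):
        # number of n_i equal to s = #{n_i > s-1} - #{n_i > s}
        exponents += [s] * (count_gt[s - 1] - count_gt[s])
    return exponents


exps = [3, 3, 2, 1]                 # hypothetical exponents n_1 >= ... >= n_r
dims = dims_of_images(exps, d=2)    # d = degree of f(t), also hypothetical
print(dims)
print(recover_exponents(dims, d=2))
```

The second printed list reproduces the exponents that generated the dimensions, which is the uniqueness statement in miniature.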


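10.32. Let $V$ be a seven-dimensional vector space over $\mathbf{R}$, and let $T: V \to V$ be a linear operator with
minimal polynomial $m(t) = (t^2 - 2t + 5)(t - 3)^3$. Find all possible rational canonical forms $M$
of $T$.

Because $\dim V = 7$, there are only two possible characteristic polynomials, $\Delta_1(t) = (t^2 - 2t + 5)^2 (t - 3)^3$
or $\Delta_2(t) = (t^2 - 2t + 5)(t - 3)^5$. Moreover, the orders of the companion matrices must
add up to 7. Also, one companion matrix must be $C(t^2 - 2t + 5)$ and one must be $C((t - 3)^3) =
C(t^3 - 9t^2 + 27t - 27)$. Thus, $M$ must be one of the following block diagonal matrices:

(a) $\operatorname{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix} \right)$

(b) $\operatorname{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix}, \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix} \right)$

(c) $\operatorname{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix}, [3], [3] \right)$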
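
The companion blocks in Problem 10.32 can be generated mechanically. The sketch below is illustrative only: the helper `companion` is a hypothetical name, and it follows the convention visible in the blocks above (1's below the diagonal, negated coefficients in the last column).

```python
def companion(coeffs):
    """Companion matrix of the monic polynomial t^k + a_{k-1} t^{k-1} + ... + a_0.

    coeffs = [a_0, a_1, ..., a_{k-1}]. The matrix has 1's just below the
    diagonal and last column (-a_0, ..., -a_{k-1}).
    """
    k = len(coeffs)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1
    for i in range(k):
        C[i][k - 1] = -coeffs[i]
    return C


quad = companion([5, -2])          # t^2 - 2t + 5
cube = companion([-27, 27, -9])    # (t - 3)^3 = t^3 - 9t^2 + 27t - 27
square = companion([9, -6])        # (t - 3)^2 = t^2 - 6t + 9

# In each candidate form the block orders must add up to dim V = 7.
forms = {
    "a": [quad, quad, cube],
    "b": [quad, cube, square],
    "c": [quad, cube, [[3]], [[3]]],
}
for name, blocks in forms.items():
    print(name, [len(b) for b in blocks], sum(len(b) for b in blocks))
```

Each candidate contains one block for $t^2 - 2t + 5$ and one for $(t-3)^3$, as the minimal polynomial requires, and the remaining blocks fill the dimension up to 7.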


Projections 

10.33. Suppose $V = W_1 \oplus \cdots \oplus W_r$. The projection of $V$ into its subspace $W_k$ is the mapping $E: V \to V$
defined by $E(v) = w_k$, where $v = w_1 + \cdots + w_r$, $w_i \in W_i$. Show that (a) $E$ is linear, (b) $E^2 = E$.

(a) Because the sum $v = w_1 + \cdots + w_r$, $w_i \in W_i$, is uniquely determined by $v$, the mapping $E$ is well
defined. Suppose, for $u \in V$, $u = w'_1 + \cdots + w'_r$, $w'_i \in W_i$. Then

$$v + u = (w_1 + w'_1) + \cdots + (w_r + w'_r) \quad \text{and} \quad kv = kw_1 + \cdots + kw_r, \qquad kw_i,\ w_i + w'_i \in W_i$$

are the unique sums corresponding to $v + u$ and $kv$. Hence,

$$E(v + u) = w_k + w'_k = E(v) + E(u) \quad \text{and} \quad E(kv) = kw_k = kE(v)$$

and therefore $E$ is linear.

(b) We have that

$$w_k = 0 + \cdots + 0 + w_k + 0 + \cdots + 0$$

is the unique sum corresponding to $w_k \in W_k$; hence, $E(w_k) = w_k$. Then, for any $v \in V$,

$$E^2(v) = E(E(v)) = E(w_k) = w_k = E(v)$$

Thus, $E^2 = E$, as required.


10.34. Suppose $E: V \to V$ is linear and $E^2 = E$. Show that (a) $E(u) = u$ for any $u \in \operatorname{Im} E$ (i.e., the
restriction of $E$ to its image is the identity mapping); (b) $V$ is the direct sum of the image and
kernel of $E$: $V = \operatorname{Im} E \oplus \operatorname{Ker} E$; (c) $E$ is the projection of $V$ into $\operatorname{Im} E$, its image. Thus, by the
preceding problem, a linear mapping $T: V \to V$ is a projection if and only if $T^2 = T$; this
characterization of a projection is frequently used as its definition.

(a) If $u \in \operatorname{Im} E$, then there exists $v \in V$ for which $E(v) = u$; hence, as required,

$$E(u) = E(E(v)) = E^2(v) = E(v) = u$$

(b) Let $v \in V$. We can write $v$ in the form $v = E(v) + (v - E(v))$. Now $E(v) \in \operatorname{Im} E$ and, because

$$E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0$$

$v - E(v) \in \operatorname{Ker} E$. Accordingly, $V = \operatorname{Im} E + \operatorname{Ker} E$.

Now suppose $w \in \operatorname{Im} E \cap \operatorname{Ker} E$. By (a), $E(w) = w$ because $w \in \operatorname{Im} E$. On the other hand,
$E(w) = 0$ because $w \in \operatorname{Ker} E$. Thus, $w = 0$, and so $\operatorname{Im} E \cap \operatorname{Ker} E = \{0\}$. These two conditions
imply that $V$ is the direct sum of the image and kernel of $E$.

(c) Let $v \in V$ and suppose $v = u + w$, where $u \in \operatorname{Im} E$ and $w \in \operatorname{Ker} E$. Note that $E(u) = u$ by (a), and
$E(w) = 0$ because $w \in \operatorname{Ker} E$. Hence,

$$E(v) = E(u + w) = E(u) + E(w) = u + 0 = u$$

That is, $E$ is the projection of $V$ into its image.
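
The decomposition $v = E(v) + (v - E(v))$ from part (b) of Problem 10.34 can be checked on a concrete idempotent matrix. This is a minimal sketch under stated assumptions: the matrix `E` and the vector `v` are hypothetical examples chosen only so that $E^2 = E$.

```python
def matvec(M, v):
    """Apply the matrix M (list of rows) to the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


E = [[1, 1], [0, 0]]          # hypothetical operator on R^2 with E^2 = E
assert matmul(E, E) == E

v = [3, 4]
u = matvec(E, v)                       # component in Im E
w = [x - y for x, y in zip(v, u)]      # component in Ker E
print(u, w)
print(matvec(E, u), matvec(E, w))      # E fixes u and kills w
```

Here `u` lies in the image, `w` lies in the kernel, and the two add back up to `v`, which is what parts (a) through (c) assert in general.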

10.35. Suppose $V = U \oplus W$ and suppose $T: V \to V$ is linear. Show that $U$ and $W$ are both $T$-invariant if
and only if $TE = ET$, where $E$ is the projection of $V$ into $U$.

Observe that $E(v) \in U$ for every $v \in V$, and that (i) $E(v) = v$ iff $v \in U$, (ii) $E(v) = 0$ iff $v \in W$.
Suppose $ET = TE$. Let $u \in U$. Because $E(u) = u$,

$$T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) \in U$$

Hence, $U$ is $T$-invariant. Now let $w \in W$. Because $E(w) = 0$,

$$E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0, \quad \text{and so} \quad T(w) \in W$$

Hence, $W$ is also $T$-invariant.

Conversely, suppose $U$ and $W$ are both $T$-invariant. Let $v \in V$ and suppose $v = u + w$, where $u \in U$
and $w \in W$. Then $T(u) \in U$ and $T(w) \in W$; hence, $E(T(u)) = T(u)$ and $E(T(w)) = 0$. Thus,

$$(ET)(v) = (ET)(u + w) = (ET)(u) + (ET)(w) = E(T(u)) + E(T(w)) = T(u)$$
and $$(TE)(v) = (TE)(u + w) = T(E(u + w)) = T(u)$$

That is, $(ET)(v) = (TE)(v)$ for every $v \in V$; therefore, $ET = TE$, as required.


SUPPLEMENTARY PROBLEMS 


Invariant Subspaces 

10.36. Suppose $W$ is invariant under $T: V \to V$. Show that $W$ is invariant under $f(T)$ for any polynomial $f(t)$.

10.37. Show that every subspace of $V$ is invariant under $I$ and $0$, the identity and zero operators.

10.38. Let $W$ be invariant under $T_1: V \to V$ and $T_2: V \to V$. Prove $W$ is also invariant under $T_1 + T_2$ and $T_1 T_2$.

10.39. Let $T: V \to V$ be linear. Prove that any eigenspace $E_\lambda$ is $T$-invariant.

10.40. Let $V$ be a vector space of odd dimension (greater than 1) over the real field $\mathbf{R}$. Show that any linear operator
on $V$ has an invariant subspace other than $V$ or $\{0\}$.

10.41. Determine the invariant subspaces of $A = \begin{bmatrix} 2 & -4 \\ 5 & -2 \end{bmatrix}$ viewed as a linear operator on (a) $\mathbf{R}^2$, (b) $\mathbf{C}^2$.

10.42. Suppose $\dim V = n$. Show that $T: V \to V$ has a triangular matrix representation if and only if there exist
$T$-invariant subspaces $W_1 \subset W_2 \subset \cdots \subset W_n = V$ for which $\dim W_k = k$, $k = 1, \ldots, n$.

Invariant Direct Sums 

10.43. The subspaces $W_1, \ldots, W_r$ are said to be independent if $w_1 + \cdots + w_r = 0$, $w_i \in W_i$, implies that each
$w_i = 0$. Show that $\operatorname{span}(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if the $W_i$ are independent. [Here $\operatorname{span}(W_i)$ denotes
the linear span of the $W_i$.]

10.44. Show that $V = W_1 \oplus \cdots \oplus W_r$ if and only if (i) $V = \operatorname{span}(W_i)$ and (ii) for $k = 1, 2, \ldots, r$,
$W_k \cap \operatorname{span}(W_1, \ldots, W_{k-1}, W_{k+1}, \ldots, W_r) = \{0\}$.

10.45. Show that $\operatorname{span}(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if $\dim[\operatorname{span}(W_i)] = \dim W_1 + \cdots + \dim W_r$.

10.46. Suppose the characteristic polynomial of $T: V \to V$ is $\Delta(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$, where the $f_i(t)$ are
distinct monic irreducible polynomials. Let $V = W_1 \oplus \cdots \oplus W_r$ be the primary decomposition of $V$ into
$T$-invariant subspaces. Show that $f_i(t)^{n_i}$ is the characteristic polynomial of the restriction of $T$ to $W_i$.

Nilpotent Operators 

10.47. Suppose $T_1$ and $T_2$ are nilpotent operators that commute (i.e., $T_1 T_2 = T_2 T_1$). Show that $T_1 + T_2$ and $T_1 T_2$ are
also nilpotent.

10.48. Suppose $A$ is a supertriangular matrix (i.e., all entries on and below the main diagonal are 0). Show that $A$ is
nilpotent.

10.49. Let $V$ be the vector space of polynomials of degree $\le n$. Show that the derivative operator on $V$ is nilpotent
of index $n + 1$.

10.50. Show that any Jordan nilpotent block matrix $N$ is similar to its transpose $N^T$ (the matrix with 1's below the
diagonal and 0's elsewhere).

10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index of nilpotency.
Show by example that the statement is not true for nilpotent matrices of order 4.

Jordan Canonical Form 

10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial $\Delta(t)$ and
minimal polynomial $m(t)$ are as follows:

(a) $\Delta(t) = (t - 2)^4 (t - 3)^2$, $m(t) = (t - 2)^2 (t - 3)^2$,

(b) $\Delta(t) = (t - 7)^5$, $m(t) = (t - 7)^2$, (c) $\Delta(t) = (t - 2)^7$, $m(t) = (t - 2)^3$

10.53. Show that every complex matrix is similar to its transpose. (Hint: Use its Jordan canonical form.)

10.54. Show that all $n \times n$ complex matrices $A$ for which $A^n = I$ but $A^k \ne I$ for $k < n$ are similar.

10.55. Suppose $A$ is a complex matrix with only real eigenvalues. Show that $A$ is similar to a matrix with only real
entries.

Cyclic Subspaces

10.56. Suppose $T: V \to V$ is linear. Prove that $Z(v, T)$ is the intersection of all $T$-invariant subspaces containing $v$.

10.57. Let $f(t)$ and $g(t)$ be the $T$-annihilators of $u$ and $v$, respectively. Show that if $f(t)$ and $g(t)$ are relatively
prime, then $f(t)g(t)$ is the $T$-annihilator of $u + v$.

10.58. Prove that $Z(u, T) = Z(v, T)$ if and only if $g(T)(u) = v$ where $g(t)$ is relatively prime to the $T$-annihilator
of $u$.

10.59. Let $W = Z(v, T)$, and suppose the $T$-annihilator of $v$ is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial
of degree $d$. Show that $f(T)^s(W)$ is a cyclic subspace generated by $f(T)^s(v)$ and that it has dimension
$d(n - s)$ if $n > s$ and dimension 0 if $n \le s$.

Rational Canonical Form 

10.60. Find all possible rational canonical forms for a $6 \times 6$ matrix over $\mathbf{R}$ with minimal polynomial:

(a) $m(t) = (t^2 - 2t + 3)(t + 1)^2$, (b) $m(t) = (t - 2)^3$.

10.61. Let $A$ be a $4 \times 4$ matrix with minimal polynomial $m(t) = (t^2 + 1)(t^2 - 3)$. Find the rational canonical form
for $A$ if $A$ is a matrix over (a) the rational field $\mathbf{Q}$, (b) the real field $\mathbf{R}$, (c) the complex field $\mathbf{C}$.

10.62. Find the rational canonical form for the four-square Jordan block with $\lambda$'s on the diagonal.

10.63. Prove that the characteristic polynomial of an operator $T: V \to V$ is a product of its elementary divisors.

10.64. Prove that two $3 \times 3$ matrices with the same minimal and characteristic polynomials are similar.

10.65. Let $C(f(t))$ denote the companion matrix to an arbitrary polynomial $f(t)$. Show that $f(t)$ is the characteristic
polynomial of $C(f(t))$.

Projections 

10.66. Suppose $V = W_1 \oplus \cdots \oplus W_r$. Let $E_i$ denote the projection of $V$ into $W_i$. Prove (i) $E_i E_j = 0$, $i \ne j$;
(ii) $I = E_1 + \cdots + E_r$.

10.67. Let $E_1, \ldots, E_r$ be linear operators on $V$ such that

(i) $E_i^2 = E_i$ (i.e., the $E_i$ are projections); (ii) $E_i E_j = 0$, $i \ne j$; (iii) $I = E_1 + \cdots + E_r$

Prove that $V = \operatorname{Im} E_1 \oplus \cdots \oplus \operatorname{Im} E_r$.

10.68. Suppose $E: V \to V$ is a projection (i.e., $E^2 = E$). Prove that $E$ has a matrix representation of the form
$\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $r$ is the rank of $E$ and $I_r$ is the $r$-square identity matrix.

10.69. Prove that any two projections of the same rank are similar. (Hint: Use the result of Problem 10.68.)

10.70. Suppose $E: V \to V$ is a projection. Prove

(i) $I - E$ is a projection and $V = \operatorname{Im} E \oplus \operatorname{Im}(I - E)$, (ii) $I + E$ is invertible (if $1 + 1 \ne 0$).

Quotient Spaces 

10.71. Let $W$ be a subspace of $V$. Suppose the set of cosets $\{v_1 + W, v_2 + W, \ldots, v_n + W\}$ in $V/W$ is linearly
independent. Show that the set of vectors $\{v_1, v_2, \ldots, v_n\}$ in $V$ is also linearly independent.

10.72. Let $W$ be a subspace of $V$. Suppose the set of vectors $\{u_1, u_2, \ldots, u_n\}$ in $V$ is linearly independent, and that
$L(u_i) \cap W = \{0\}$. Show that the set of cosets $\{u_1 + W, \ldots, u_n + W\}$ in $V/W$ is also linearly
independent.

10.73. Suppose $V = U \oplus W$ and that $\{u_1, \ldots, u_n\}$ is a basis of $U$. Show that $\{u_1 + W, \ldots, u_n + W\}$ is a basis of
the quotient space $V/W$. (Observe that no condition is placed on the dimensionality of $V$ or $W$.)

10.74. Let $W$ be the solution space of the linear equation

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0, \qquad a_i \in K$$

and let $v = (b_1, b_2, \ldots, b_n) \in K^n$. Prove that the coset $v + W$ of $W$ in $K^n$ is the solution set of the linear
equation

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b, \qquad \text{where } b = a_1 b_1 + \cdots + a_n b_n$$

10.75. Let $V$ be the vector space of polynomials over $\mathbf{R}$ and let $W$ be the subspace of polynomials divisible by $t^4$
(i.e., of the form $a_0 t^4 + a_1 t^5 + \cdots + a_{n-4} t^n$). Show that the quotient space $V/W$ has dimension 4.

10.76. Let $U$ and $W$ be subspaces of $V$ such that $W \subset U \subset V$. Note that any coset $u + W$ of $W$ in $U$ may also be
viewed as a coset of $W$ in $V$, because $u \in U$ implies $u \in V$; hence, $U/W$ is a subset of $V/W$. Prove that
(i) $U/W$ is a subspace of $V/W$, (ii) $\dim(V/W) - \dim(U/W) = \dim(V/U)$.

10.77. Let $U$ and $W$ be subspaces of $V$. Show that the cosets of $U \cap W$ in $V$ can be obtained by intersecting each of
the cosets of $U$ in $V$ by each of the cosets of $W$ in $V$:

$$V/(U \cap W) = \{(v + U) \cap (v' + W) : v, v' \in V\}$$

10.78. Let $T: V \to V'$ be linear with kernel $W$ and image $U$. Show that the quotient
space $V/W$ is isomorphic to $U$ under the mapping $\theta: V/W \to U$ defined by
$\theta(v + W) = T(v)$. Furthermore, show that $T = i \circ \theta \circ \eta$, where $\eta: V \to V/W$ is
the natural mapping of $V$ into $V/W$ (i.e., $\eta(v) = v + W$), and $i: U \to V'$ is the
inclusion mapping (i.e., $i(u) = u$). (See diagram.)



ANSWERS TO SUPPLEMENTARY PROBLEMS 


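10.41. (a) $\mathbf{R}^2$ and $\{0\}$, (b) $\mathbf{C}^2$, $\{0\}$, $W_1 = \operatorname{span}(2, 1 - 2i)$, $W_2 = \operatorname{span}(2, 1 + 2i)$

10.52. (a) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix} \right)$, $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, [2], [2], \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix} \right)$;

(b) $\operatorname{diag}\left( \begin{bmatrix} 7 & 1 \\ 0 & 7 \end{bmatrix}, \begin{bmatrix} 7 & 1 \\ 0 & 7 \end{bmatrix}, [7] \right)$, $\operatorname{diag}\left( \begin{bmatrix} 7 & 1 \\ 0 & 7 \end{bmatrix}, [7], [7], [7] \right)$;

(c) Let $M_k$ denote a Jordan block with $\lambda = 2$ and order $k$. Then $\operatorname{diag}(M_3, M_3, M_1)$, $\operatorname{diag}(M_3, M_2, M_2)$,
$\operatorname{diag}(M_3, M_2, M_1, M_1)$, $\operatorname{diag}(M_3, M_1, M_1, M_1, M_1)$

10.60. Let $A = \begin{bmatrix} 0 & -3 \\ 1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 0 & -1 \\ 1 & -2 \end{bmatrix}$, $C = \begin{bmatrix} 0 & 0 & 8 \\ 1 & 0 & -12 \\ 0 & 1 & 6 \end{bmatrix}$, $D = \begin{bmatrix} 0 & -4 \\ 1 & 4 \end{bmatrix}$.

(a) $\operatorname{diag}(A, A, B)$, $\operatorname{diag}(A, B, B)$, $\operatorname{diag}(A, B, -1, -1)$; (b) $\operatorname{diag}(C, C)$, $\operatorname{diag}(C, D, 2)$, $\operatorname{diag}(C, 2, 2, 2)$

10.61. Let $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 0 & 3 \\ 1 & 0 \end{bmatrix}$.

(a) $\operatorname{diag}(A, B)$, (b) $\operatorname{diag}(A, \sqrt{3}, -\sqrt{3})$, (c) $\operatorname{diag}(i, -i, \sqrt{3}, -\sqrt{3})$

10.62. Companion matrix with the last column $[-\lambda^4, 4\lambda^3, -6\lambda^2, 4\lambda]^T$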




































Linear Functionals 
and the Dual Space 


11.1 Introduction 


In this chapter, we study linear mappings from a vector space $V$ into its field $K$ of scalars. (Unless
otherwise stated or implied, we view $K$ as a vector space over itself.) Naturally, all the theorems and results
for arbitrary mappings on $V$ hold for this special case. However, we treat these mappings separately
because of their fundamental importance and because the special relationship of $V$ to $K$ gives rise to new
notions and results that do not apply in the general case.

11.2 Linear Functionals and the Dual Space 


Let $V$ be a vector space over a field $K$. A mapping $\phi: V \to K$ is termed a linear functional (or linear form)
if, for every $u, v \in V$ and every $a, b \in K$,

$$\phi(au + bv) = a\phi(u) + b\phi(v)$$

In other words, a linear functional on $V$ is a linear mapping from $V$ into $K$.

EXAMPLE 11.1

(a) Let $\pi_i: K^n \to K$ be the $i$th projection mapping; that is, $\pi_i(a_1, a_2, \ldots, a_n) = a_i$. Then $\pi_i$ is linear and so it is a linear
functional on $K^n$.

(b) Let $V$ be the vector space of polynomials in $t$ over $\mathbf{R}$. Let $J: V \to \mathbf{R}$ be the integral operator defined by
$J(p(t)) = \int_0^1 p(t)\, dt$. Recall that $J$ is linear; hence, it is a linear functional on $V$.

(c) Let $V$ be the vector space of $n$-square matrices over $K$. Let $T: V \to K$ be the trace mapping

$$T(A) = a_{11} + a_{22} + \cdots + a_{nn}, \quad \text{where } A = [a_{ij}]$$

That is, $T$ assigns to a matrix $A$ the sum of its diagonal elements. This map is linear (Problem 11.24), and so it is a
linear functional on $V$.
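
The defining identity $\phi(au + bv) = a\phi(u) + b\phi(v)$ can be spot-checked for the trace mapping of part (c). The sketch below is only a numerical illustration on one hypothetical pair of matrices and scalars, not a proof of linearity.

```python
def trace(A):
    """Sum of the diagonal entries of a square matrix (list of rows)."""
    return sum(A[i][i] for i in range(len(A)))


def lin_comb(a, A, b, B):
    """Entrywise a*A + b*B for matrices of the same shape."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2], [3, 4]]      # hypothetical sample matrices
B = [[0, -1], [5, 7]]
a, b = 2, -3              # hypothetical scalars

lhs = trace(lin_comb(a, A, b, B))      # T(aA + bB)
rhs = a * trace(A) + b * trace(B)      # aT(A) + bT(B)
print(lhs, rhs)
```

Both printed values agree, as the linearity of the trace predicts.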

By Theorem 5.10, the set of linear functionals on a vector space $V$ over a field $K$ is also a vector
space over $K$, with addition and scalar multiplication defined by

$$(\phi + \sigma)(v) = \phi(v) + \sigma(v) \quad \text{and} \quad (k\phi)(v) = k\phi(v)$$

where $\phi$ and $\sigma$ are linear functionals on $V$ and $k \in K$. This space is called the dual space of $V$ and is
denoted by $V^*$.

EXAMPLE 11.2 Let $V = K^n$, the vector space of $n$-tuples, which we write as column vectors. Then the dual space $V^*$ can
be identified with the space of row vectors. In particular, any linear functional $\phi = (a_1, \ldots, a_n)$ in $V^*$ has the representation

$$\phi(x_1, x_2, \ldots, x_n) = [a_1, a_2, \ldots, a_n][x_1, x_2, \ldots, x_n]^T = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$$

Historically, the formal expression on the right was termed a linear form.

11.3 Dual Basis


Suppose $V$ is a vector space of dimension $n$ over $K$. By Theorem 5.11, the dimension of the dual space $V^*$
is also $n$ (because $K$ is of dimension 1 over itself). In fact, each basis of $V$ determines a basis of $V^*$ as
follows (see Problem 11.3 for the proof).

THEOREM 11.1: Suppose $\{v_1, \ldots, v_n\}$ is a basis of $V$ over $K$. Let $\phi_1, \ldots, \phi_n \in V^*$ be the linear
functionals defined by

$$\phi_i(v_j) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}$$

Then $\{\phi_1, \ldots, \phi_n\}$ is a basis of $V^*$.

The above basis $\{\phi_i\}$ is termed the basis dual to $\{v_i\}$ or the dual basis. The above formula, which
uses the Kronecker delta $\delta_{ij}$, is a short way of writing

$$\phi_1(v_1) = 1, \ \phi_1(v_2) = 0, \ \phi_1(v_3) = 0, \ \ldots, \ \phi_1(v_n) = 0$$
$$\phi_2(v_1) = 0, \ \phi_2(v_2) = 1, \ \phi_2(v_3) = 0, \ \ldots, \ \phi_2(v_n) = 0$$
$$\cdots$$
$$\phi_n(v_1) = 0, \ \phi_n(v_2) = 0, \ \ldots, \ \phi_n(v_{n-1}) = 0, \ \phi_n(v_n) = 1$$

By Theorem 5.2, these linear mappings $\phi_i$ are unique and well defined.

EXAMPLE 11.3 Consider the basis $\{v_1 = (2, 1), v_2 = (3, 1)\}$ of $\mathbf{R}^2$. Find the dual basis $\{\phi_1, \phi_2\}$.
We seek linear functionals $\phi_1(x, y) = ax + by$ and $\phi_2(x, y) = cx + dy$ such that

$$\phi_1(v_1) = 1, \quad \phi_1(v_2) = 0, \quad \phi_2(v_1) = 0, \quad \phi_2(v_2) = 1$$

These four conditions lead to the following two systems of linear equations:

$$\begin{cases} \phi_1(v_1) = \phi_1(2, 1) = 2a + b = 1 \\ \phi_1(v_2) = \phi_1(3, 1) = 3a + b = 0 \end{cases} \qquad \begin{cases} \phi_2(v_1) = \phi_2(2, 1) = 2c + d = 0 \\ \phi_2(v_2) = \phi_2(3, 1) = 3c + d = 1 \end{cases}$$

The solutions yield $a = -1$, $b = 3$ and $c = 1$, $d = -2$. Hence, $\phi_1(x, y) = -x + 3y$ and $\phi_2(x, y) = x - 2y$ form the
dual basis.
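
The two systems in Example 11.3 amount to inverting the $2 \times 2$ matrix whose columns are $v_1$ and $v_2$; the rows of the inverse are the coefficient pairs of $\phi_1$ and $\phi_2$. The sketch below assumes that reformulation; the function name `dual_basis_2d` is hypothetical.

```python
from fractions import Fraction


def dual_basis_2d(v1, v2):
    """Return (phi1, phi2) as coefficient pairs with phi_i(v_j) = delta_ij.

    The pairs are the rows of the inverse of the matrix with columns v1, v2.
    """
    (p, q), (r, s) = v1, v2
    det = Fraction(p * s - q * r)
    if det == 0:
        raise ValueError("v1 and v2 do not form a basis")
    phi1 = (s / det, -r / det)
    phi2 = (-q / det, p / det)
    return phi1, phi2


phi1, phi2 = dual_basis_2d((2, 1), (3, 1))
print(phi1, phi2)
```

Exact fractions are used so that bases with non-unit determinant do not introduce rounding.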


The next two theorems (proved in Problems 11.4 and 11.5, respectively) give relationships between
bases and their duals.

THEOREM 11.2: Let $\{v_1, \ldots, v_n\}$ be a basis of $V$ and let $\{\phi_1, \ldots, \phi_n\}$ be the dual basis in $V^*$. Then

(i) For any vector $u \in V$, $u = \phi_1(u)v_1 + \phi_2(u)v_2 + \cdots + \phi_n(u)v_n$.

(ii) For any linear functional $\sigma \in V^*$, $\sigma = \sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n$.

THEOREM 11.3: Let $\{v_1, \ldots, v_n\}$ and $\{w_1, \ldots, w_n\}$ be bases of $V$ and let $\{\phi_1, \ldots, \phi_n\}$ and
$\{\sigma_1, \ldots, \sigma_n\}$ be the bases of $V^*$ dual to $\{v_i\}$ and $\{w_i\}$, respectively. Suppose $P$ is
the change-of-basis matrix from $\{v_i\}$ to $\{w_i\}$. Then $(P^{-1})^T$ is the change-of-basis
matrix from $\{\phi_i\}$ to $\{\sigma_i\}$.
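
Part (i) of Theorem 11.2 says that the dual functionals read off the coordinates of a vector. A minimal sketch, using the basis $v_1 = (2, 1)$, $v_2 = (3, 1)$ of $\mathbf{R}^2$ with dual basis $\phi_1(x, y) = -x + 3y$, $\phi_2(x, y) = x - 2y$; the helper `expand` and the test vector are hypothetical.

```python
v1, v2 = (2, 1), (3, 1)


def phi1(x, y):
    return -x + 3 * y


def phi2(x, y):
    return x - 2 * y


def expand(u):
    """Return the coordinates (phi1(u), phi2(u)) and the vector they rebuild."""
    c1, c2 = phi1(*u), phi2(*u)
    rebuilt = tuple(c1 * a + c2 * b for a, b in zip(v1, v2))
    return (c1, c2), rebuilt


coords, rebuilt = expand((5, 7))
print(coords, rebuilt)
```

The rebuilt vector equals the input, so $u = \phi_1(u)v_1 + \phi_2(u)v_2$ holds for this sample.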


11.4 Second Dual Space 


We repeat: Every vector space $V$ has a dual space $V^*$, which consists of all the linear functionals on $V$.
Thus, $V^*$ has a dual space $V^{**}$, called the second dual of $V$, which consists of all the linear functionals
on $V^*$.

We now show that each $v \in V$ determines a specific element $\hat{v} \in V^{**}$. First, for any $\phi \in V^*$, we define

$$\hat{v}(\phi) = \phi(v)$$

It remains to be shown that this map $\hat{v}: V^* \to K$ is linear. For any scalars $a, b \in K$ and any linear
functionals $\phi, \sigma \in V^*$, we have

$$\hat{v}(a\phi + b\sigma) = (a\phi + b\sigma)(v) = a\phi(v) + b\sigma(v) = a\hat{v}(\phi) + b\hat{v}(\sigma)$$

That is, $\hat{v}$ is linear and so $\hat{v} \in V^{**}$. The following theorem (proved in Problem 11.7) holds.

THEOREM 11.4: If $V$ has finite dimension, then the mapping $v \mapsto \hat{v}$ is an isomorphism of $V$
onto $V^{**}$.

The above mapping $v \mapsto \hat{v}$ is called the natural mapping of $V$ into $V^{**}$. We emphasize that this
mapping is never onto $V^{**}$ if $V$ is not finite-dimensional. However, it is always linear, and moreover, it is
always one-to-one.

Now suppose $V$ does have finite dimension. By Theorem 11.4, the natural mapping determines an
isomorphism between $V$ and $V^{**}$. Unless otherwise stated, we will identify $V$ with $V^{**}$ by this
mapping. Accordingly, we will view $V$ as the space of linear functionals on $V^*$ and write $V = V^{**}$. We
remark that if $\{\phi_i\}$ is the basis of $V^*$ dual to a basis $\{v_i\}$ of $V$, then $\{v_i\}$ is the basis of $V^{**} = V$ that is
dual to $\{\phi_i\}$.


11.5 Annihilators 


Let $W$ be a subset (not necessarily a subspace) of a vector space $V$. A linear functional $\phi \in V^*$ is called an
annihilator of $W$ if $\phi(w) = 0$ for every $w \in W$; that is, if $\phi(W) = \{0\}$. We show that the set of all such
mappings, denoted by $W^0$ and called the annihilator of $W$, is a subspace of $V^*$. Clearly, $0 \in W^0$. Now
suppose $\phi, \sigma \in W^0$. Then, for any scalars $a, b \in K$ and for any $w \in W$,

$$(a\phi + b\sigma)(w) = a\phi(w) + b\sigma(w) = a0 + b0 = 0$$

Thus, $a\phi + b\sigma \in W^0$, and so $W^0$ is a subspace of $V^*$.

In the case that $W$ is a subspace of $V$, we have the following relationship between $W$ and its annihilator
$W^0$ (see Problem 11.11 for the proof).

THEOREM 11.5: Suppose $V$ has finite dimension and $W$ is a subspace of $V$. Then

(i) $\dim W + \dim W^0 = \dim V$ and (ii) $W^{00} = W$

Here $W^{00} = \{v \in V : \phi(v) = 0 \text{ for every } \phi \in W^0\}$ or, equivalently, $W^{00} = (W^0)^0$, where $W^{00}$ is viewed
as a subspace of $V$ under the identification of $V$ and $V^{**}$.


11.6 Transpose of a Linear Mapping 


Let $T: V \to U$ be an arbitrary linear mapping from a vector space $V$ into a vector space $U$. Now for any
linear functional $\phi \in U^*$, the composition $\phi \circ T$ is a linear mapping from $V$ into $K$:

$$V \xrightarrow{\ T\ } U \xrightarrow{\ \phi\ } K$$

That is, $\phi \circ T \in V^*$. Thus, the correspondence

$$\phi \mapsto \phi \circ T$$

is a mapping from $U^*$ into $V^*$; we denote it by $T^t$ and call it the transpose of $T$. In other words,
$T^t: U^* \to V^*$ is defined by

$$T^t(\phi) = \phi \circ T$$

Thus, $(T^t(\phi))(v) = \phi(T(v))$ for every $v \in V$.

THEOREM 11.6: The transpose mapping $T^t$ defined above is linear.

Proof. For any scalars $a, b \in K$ and any linear functionals $\phi, \sigma \in U^*$,

$$T^t(a\phi + b\sigma) = (a\phi + b\sigma) \circ T = a(\phi \circ T) + b(\sigma \circ T) = aT^t(\phi) + bT^t(\sigma)$$

That is, $T^t$ is linear, as claimed.

We emphasize that if $T$ is a linear mapping from $V$ into $U$, then $T^t$ is a linear mapping from $U^*$ into $V^*$.
The name "transpose" for the mapping $T^t$ no doubt derives from the following theorem (proved in
Problem 11.16).

THEOREM 11.7: Let $T: V \to U$ be linear, and let $A$ be the matrix representation of $T$ relative to bases
$\{v_i\}$ of $V$ and $\{u_i\}$ of $U$. Then the transpose matrix $A^T$ is the matrix representation of
$T^t: U^* \to V^*$ relative to the bases dual to $\{u_i\}$ and $\{v_i\}$.
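
In coordinates, the identity $(T^t(\phi))(v) = \phi(T(v))$ becomes $(A^T\phi) \cdot v = \phi \cdot (Av)$. The following is a minimal sketch under simplifying assumptions: it works in $\mathbf{R}^3$ and $\mathbf{R}^2$ with the usual bases and writes vectors and functionals as plain coordinate lists; the matrix, functional, and vector are hypothetical samples.

```python
def matvec(M, v):
    """Apply the matrix M (list of rows) to the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


A = [[1, 2, 0], [0, -1, 3]]   # hypothetical T: R^3 -> R^2
phi = [4, 5]                  # hypothetical functional on R^2
v = [1, -2, 2]                # hypothetical vector in R^3

lhs = dot(matvec(transpose(A), phi), v)   # (T^t phi)(v), computed via A^T
rhs = dot(phi, matvec(A, v))              # phi(T(v))
print(lhs, rhs)
```

The two values coincide, which is the coordinate form of the definition of the transpose mapping.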


SOLVED PROBLEMS 


Dual Spaces and Dual Bases 

11.1. Find the basis $\{\phi_1, \phi_2, \phi_3\}$ that is dual to the following basis of $\mathbf{R}^3$:

$$\{v_1 = (1, -1, 3), \quad v_2 = (0, 1, -1), \quad v_3 = (0, 3, -2)\}$$

The linear functionals may be expressed in the form

$$\phi_1(x, y, z) = a_1 x + a_2 y + a_3 z, \quad \phi_2(x, y, z) = b_1 x + b_2 y + b_3 z, \quad \phi_3(x, y, z) = c_1 x + c_2 y + c_3 z$$

By definition of the dual basis, $\phi_i(v_j) = 0$ for $i \ne j$, but $\phi_i(v_j) = 1$ for $i = j$.

We find $\phi_1$ by setting $\phi_1(v_1) = 1$, $\phi_1(v_2) = 0$, $\phi_1(v_3) = 0$. This yields

$$\phi_1(1, -1, 3) = a_1 - a_2 + 3a_3 = 1, \quad \phi_1(0, 1, -1) = a_2 - a_3 = 0, \quad \phi_1(0, 3, -2) = 3a_2 - 2a_3 = 0$$

Solving the system of equations yields $a_1 = 1$, $a_2 = 0$, $a_3 = 0$. Thus, $\phi_1(x, y, z) = x$.

We find $\phi_2$ by setting $\phi_2(v_1) = 0$, $\phi_2(v_2) = 1$, $\phi_2(v_3) = 0$. This yields

$$\phi_2(1, -1, 3) = b_1 - b_2 + 3b_3 = 0, \quad \phi_2(0, 1, -1) = b_2 - b_3 = 1, \quad \phi_2(0, 3, -2) = 3b_2 - 2b_3 = 0$$

Solving the system of equations yields $b_1 = 7$, $b_2 = -2$, $b_3 = -3$. Thus, $\phi_2(x, y, z) = 7x - 2y - 3z$.

We find $\phi_3$ by setting $\phi_3(v_1) = 0$, $\phi_3(v_2) = 0$, $\phi_3(v_3) = 1$. This yields

$$\phi_3(1, -1, 3) = c_1 - c_2 + 3c_3 = 0, \quad \phi_3(0, 1, -1) = c_2 - c_3 = 0, \quad \phi_3(0, 3, -2) = 3c_2 - 2c_3 = 1$$

Solving the system of equations yields $c_1 = -2$, $c_2 = 1$, $c_3 = 1$. Thus, $\phi_3(x, y, z) = -2x + y + z$.
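
The three systems in Problem 11.1 can be solved at once: if the basis vectors are placed as the columns of a matrix $M$, the rows of $M^{-1}$ give the coefficients of the dual functionals. The sketch below assumes that reformulation and uses a plain Gauss-Jordan inverse over exact fractions; the function names are hypothetical.

```python
from fractions import Fraction


def inverse(M):
    """Gauss-Jordan inverse of a square matrix, with exact arithmetic."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular; the vectors are not a basis")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dual_basis(vectors):
    """Rows of the result are the coefficient lists of the dual functionals."""
    M = [list(col) for col in zip(*vectors)]   # columns of M are the basis vectors
    return inverse(M)


basis = [(1, -1, 3), (0, 1, -1), (0, 3, -2)]
for phi in dual_basis(basis):
    print([str(c) for c in phi])
```

Applying each printed functional to each basis vector gives the Kronecker delta pattern required by the definition of the dual basis.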


11.2. Let $V = \{a + bt : a, b \in \mathbf{R}\}$, the vector space of real polynomials of degree $\le 1$. Find the basis
$\{v_1, v_2\}$ of $V$ that is dual to the basis $\{\phi_1, \phi_2\}$ of $V^*$ defined by

$$\phi_1(f(t)) = \int_0^1 f(t)\, dt \quad \text{and} \quad \phi_2(f(t)) = \int_0^2 f(t)\, dt$$

Let $v_1 = a + bt$ and $v_2 = c + dt$. By definition of the dual basis,

$$\phi_1(v_1) = 1, \quad \phi_1(v_2) = 0 \quad \text{and} \quad \phi_2(v_1) = 0, \quad \phi_2(v_2) = 1$$

Thus,

$$\begin{cases} \phi_1(v_1) = \int_0^1 (a + bt)\, dt = a + \tfrac{1}{2}b = 1 \\ \phi_2(v_1) = \int_0^2 (a + bt)\, dt = 2a + 2b = 0 \end{cases} \quad \text{and} \quad \begin{cases} \phi_1(v_2) = \int_0^1 (c + dt)\, dt = c + \tfrac{1}{2}d = 0 \\ \phi_2(v_2) = \int_0^2 (c + dt)\, dt = 2c + 2d = 1 \end{cases}$$

Solving each system yields $a = 2$, $b = -2$ and $c = -\tfrac{1}{2}$, $d = 1$. Thus, $\{v_1 = 2 - 2t, \ v_2 = -\tfrac{1}{2} + t\}$ is
the basis of $V$ that is dual to $\{\phi_1, \phi_2\}$.
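
The answer to Problem 11.2 can be verified by evaluating the two integral functionals exactly on $v_1 = 2 - 2t$ and $v_2 = -\tfrac{1}{2} + t$. A minimal sketch, assuming polynomials are stored as coefficient lists $[c_0, c_1, \ldots]$; the helper `integrate` is a hypothetical name.

```python
from fractions import Fraction


def integrate(poly, upper):
    """Integral over [0, upper] of the polynomial c_0 + c_1 t + ... (exact)."""
    return sum(Fraction(c) * Fraction(upper) ** (k + 1) / (k + 1)
               for k, c in enumerate(poly))


def phi1(p):
    return integrate(p, 1)


def phi2(p):
    return integrate(p, 2)


v1 = [2, -2]                  # 2 - 2t
v2 = [Fraction(-1, 2), 1]     # -1/2 + t

# table[i][j] = phi_{i+1}(v_{j+1}); duality means this is the identity matrix
table = [[phi(v) for v in (v1, v2)] for phi in (phi1, phi2)]
print(table)
```

The table of values is the $2 \times 2$ identity, which confirms that the two polynomials are dual to the two integral functionals.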
11.3. Prove Theorem 11.1: Suppose $\{v_1, \ldots, v_n\}$ is a basis of $V$ over $K$. Let $\phi_1, \ldots, \phi_n \in V^*$ be
defined by $\phi_i(v_j) = 0$ for $i \ne j$, but $\phi_i(v_j) = 1$ for $i = j$. Then $\{\phi_1, \ldots, \phi_n\}$ is a basis of $V^*$.

We first show that $\{\phi_1, \ldots, \phi_n\}$ spans $V^*$. Let $\phi$ be an arbitrary element of $V^*$, and suppose

$$\phi(v_1) = k_1, \quad \phi(v_2) = k_2, \quad \ldots, \quad \phi(v_n) = k_n$$

Set $\sigma = k_1\phi_1 + \cdots + k_n\phi_n$. Then

$$\sigma(v_1) = (k_1\phi_1 + \cdots + k_n\phi_n)(v_1) = k_1\phi_1(v_1) + k_2\phi_2(v_1) + \cdots + k_n\phi_n(v_1) = k_1 \cdot 1 + k_2 \cdot 0 + \cdots + k_n \cdot 0 = k_1$$

Similarly, for $i = 2, \ldots, n$,

$$\sigma(v_i) = (k_1\phi_1 + \cdots + k_n\phi_n)(v_i) = k_1\phi_1(v_i) + \cdots + k_i\phi_i(v_i) + \cdots + k_n\phi_n(v_i) = k_i$$

Thus, $\phi(v_i) = \sigma(v_i)$ for $i = 1, \ldots, n$. Because $\phi$ and $\sigma$ agree on the basis vectors,
$\phi = \sigma = k_1\phi_1 + \cdots + k_n\phi_n$. Accordingly, $\{\phi_1, \ldots, \phi_n\}$ spans $V^*$.

It remains to be shown that $\{\phi_1, \ldots, \phi_n\}$ is linearly independent. Suppose

$$a_1\phi_1 + a_2\phi_2 + \cdots + a_n\phi_n = 0$$

Applying both sides to $v_1$, we obtain

$$0 = 0(v_1) = (a_1\phi_1 + \cdots + a_n\phi_n)(v_1) = a_1\phi_1(v_1) + a_2\phi_2(v_1) + \cdots + a_n\phi_n(v_1) = a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_n \cdot 0 = a_1$$

Similarly, for $i = 2, \ldots, n$,

$$0 = 0(v_i) = (a_1\phi_1 + \cdots + a_n\phi_n)(v_i) = a_1\phi_1(v_i) + \cdots + a_i\phi_i(v_i) + \cdots + a_n\phi_n(v_i) = a_i$$

That is, $a_1 = 0, \ldots, a_n = 0$. Hence, $\{\phi_1, \ldots, \phi_n\}$ is linearly independent, and so it is a basis of $V^*$.

11.4. Prove Theorem 11.2: Let $\{v_1, \ldots, v_n\}$ be a basis of $V$ and let $\{\phi_1, \ldots, \phi_n\}$ be the dual basis in
$V^*$. For any $u \in V$ and any $\sigma \in V^*$, (i) $u = \sum_i \phi_i(u)v_i$, (ii) $\sigma = \sum_i \sigma(v_i)\phi_i$.

Suppose

$$u = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n \qquad (1)$$

Then

$$\phi_1(u) = a_1\phi_1(v_1) + a_2\phi_1(v_2) + \cdots + a_n\phi_1(v_n) = a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_n \cdot 0 = a_1$$

Similarly, for $i = 2, \ldots, n$,

$$\phi_i(u) = a_1\phi_i(v_1) + \cdots + a_i\phi_i(v_i) + \cdots + a_n\phi_i(v_n) = a_i$$

That is, $\phi_1(u) = a_1$, $\phi_2(u) = a_2$, $\ldots$, $\phi_n(u) = a_n$. Substituting these results into (1), we obtain (i).

Next we prove (ii). Applying the linear functional $\sigma$ to both sides of (i),

$$\sigma(u) = \phi_1(u)\sigma(v_1) + \phi_2(u)\sigma(v_2) + \cdots + \phi_n(u)\sigma(v_n)$$
$$= \sigma(v_1)\phi_1(u) + \sigma(v_2)\phi_2(u) + \cdots + \sigma(v_n)\phi_n(u)$$
$$= (\sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n)(u)$$

Because the above holds for every $u \in V$, $\sigma = \sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n$, as claimed.

11.5. Prove Theorem 11.3. Let $\{v_i\}$ and $\{w_i\}$ be bases of $V$ and let $\{\phi_i\}$ and $\{\sigma_i\}$ be the respective
dual bases in $V^*$. Let $P$ be the change-of-basis matrix from $\{v_i\}$ to $\{w_i\}$. Then $(P^{-1})^T$ is the
change-of-basis matrix from $\{\phi_i\}$ to $\{\sigma_i\}$.

Suppose, for $i = 1, \ldots, n$,

$$w_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n \quad \text{and} \quad \sigma_i = b_{i1}\phi_1 + b_{i2}\phi_2 + \cdots + b_{in}\phi_n$$

Then $P = [a_{ij}]$ and $Q = [b_{ij}]$. We seek to prove that $Q = (P^{-1})^T$.

Let $R_i$ denote the $i$th row of $Q$ and let $C_j$ denote the $j$th column of $P^T$. Then

$$R_i = (b_{i1}, b_{i2}, \ldots, b_{in}) \quad \text{and} \quad C_j = (a_{j1}, a_{j2}, \ldots, a_{jn})^T$$

By definition of the dual basis,

$$\sigma_i(w_j) = (b_{i1}\phi_1 + b_{i2}\phi_2 + \cdots + b_{in}\phi_n)(a_{j1}v_1 + a_{j2}v_2 + \cdots + a_{jn}v_n)$$
$$= b_{i1}a_{j1} + b_{i2}a_{j2} + \cdots + b_{in}a_{jn} = R_i C_j = \delta_{ij}$$

where $\delta_{ij}$ is the Kronecker delta. Thus,

$$QP^T = [R_i C_j] = [\delta_{ij}] = I$$

Therefore, $Q = (P^T)^{-1} = (P^{-1})^T$, as claimed.

11.6. Suppose $v \in V$, $v \ne 0$, and $\dim V = n$. Show that there exists $\phi \in V^*$ such that $\phi(v) \ne 0$.

We extend $\{v\}$ to a basis $\{v, v_2, \ldots, v_n\}$ of $V$. By Theorem 5.2, there exists a unique linear mapping
$\phi: V \to K$ such that $\phi(v) = 1$ and $\phi(v_i) = 0$, $i = 2, \ldots, n$. Hence, $\phi$ has the desired property.

11.7. Prove Theorem 11.4: Suppose $\dim V = n$. Then the natural mapping $v \mapsto \hat{v}$ is an isomorphism of
$V$ onto $V^{**}$.

We first prove that the map $v \mapsto \hat{v}$ is linear; that is, for any vectors $v, w \in V$ and any scalars $a, b \in K$,
$\widehat{av + bw} = a\hat{v} + b\hat{w}$. For any linear functional $\phi \in V^*$,

$$\widehat{av + bw}(\phi) = \phi(av + bw) = a\phi(v) + b\phi(w) = a\hat{v}(\phi) + b\hat{w}(\phi) = (a\hat{v} + b\hat{w})(\phi)$$

Because $\widehat{av + bw}(\phi) = (a\hat{v} + b\hat{w})(\phi)$ for every $\phi \in V^*$, we have $\widehat{av + bw} = a\hat{v} + b\hat{w}$. Thus, the map
$v \mapsto \hat{v}$ is linear.

Now suppose $v \in V$, $v \ne 0$. Then, by Problem 11.6, there exists $\phi \in V^*$ for which $\phi(v) \ne 0$. Hence,
$\hat{v}(\phi) = \phi(v) \ne 0$, and thus $\hat{v} \ne 0$. Because $v \ne 0$ implies $\hat{v} \ne 0$, the map $v \mapsto \hat{v}$ is nonsingular and hence
an isomorphism (Theorem 5.64).

Now $\dim V = \dim V^* = \dim V^{**}$, because $V$ has finite dimension. Accordingly, the mapping $v \mapsto \hat{v}$
is an isomorphism of $V$ onto $V^{**}$.

Annihilators 

11.8. Show that if (f> £ V* annihilates a subset S of V, then (f) annihilates the linear span L(S ) of S. 
Hence, S° = [span(S)]°. 

Suppose v G span(S). Then there exists vtq,..., w r G S for which v = a x w x + a 2 w 2 + • • • + a r w r . 
4>(v) = a x 4>(w l ) + a 2 cj)(w 2 ) + • ■ ■ + a r (f>(w r ) = a x 0 + a 2 0 + • • • + a r 0 = 0 
Because v was an arbitrary element of span (S),(j) annihilates span(S), as claimed. 

11.9. Find a basis of the annihilator $W^0$ of the subspace $W$ of $\mathbf{R}^4$ spanned by

$$v_1 = (1, 2, -3, 4) \quad\text{and}\quad v_2 = (0, 1, 4, -1)$$

By Problem 11.8, it suffices to find a basis of the set of linear functionals $\phi$ such that $\phi(v_1) = 0$ and
$\phi(v_2) = 0$, where $\phi(x_1, x_2, x_3, x_4) = ax_1 + bx_2 + cx_3 + dx_4$. Thus,

$$\phi(1, 2, -3, 4) = a + 2b - 3c + 4d = 0 \quad\text{and}\quad \phi(0, 1, 4, -1) = b + 4c - d = 0$$

The system of two equations in the unknowns $a, b, c, d$ is in echelon form with free variables $c$ and $d$.

(1) Set $c = 1$, $d = 0$ to obtain the solution $a = 11$, $b = -4$, $c = 1$, $d = 0$.

(2) Set $c = 0$, $d = 1$ to obtain the solution $a = -6$, $b = 1$, $c = 0$, $d = 1$.

The linear functionals $\phi_1(x_i) = 11x_1 - 4x_2 + x_3$ and $\phi_2(x_i) = -6x_1 + x_2 + x_4$ form a basis of $W^0$.
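
The computation in Problem 11.9 can be checked mechanically. The following is a minimal sketch, assuming exact rational arithmetic and a hypothetical helper name `annihilator_basis` (it is not part of the text): it row-reduces the spanning vectors and reads off one functional per free variable, exactly as in steps (1) and (2).

```python
from fractions import Fraction


def annihilator_basis(spanning_vectors):
    """Return coefficient tuples (a_1, ..., a_n) of functionals
    a_1*x_1 + ... + a_n*x_n that vanish on every given vector."""
    rows = [[Fraction(x) for x in v] for v in spanning_vectors]
    n = len(rows[0])
    pivots = []  # pivots[i] is the pivot column of reduced row i
    r = 0
    for c in range(n):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for fc in free:
        # Set one free variable to 1, the others to 0, and solve for the pivots.
        sol = [Fraction(0)] * n
        sol[fc] = Fraction(1)
        for i, pc in enumerate(pivots):
            sol[pc] = -rows[i][fc]
        basis.append(tuple(sol))
    return basis


def evaluate(functional, vector):
    """Apply the functional with the given coefficients to a vector."""
    return sum(a * x for a, x in zip(functional, vector))


W = [(1, 2, -3, 4), (0, 1, 4, -1)]
basis = annihilator_basis(W)
```

Each returned tuple can be validated independently of the elimination by applying `evaluate` to it and to each spanning vector; every such value should be zero.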

11.10. Show that (a) For any subset $S$ of $V$, $S \subseteq S^{00}$. (b) If $S_1 \subseteq S_2$, then $S_2^0 \subseteq S_1^0$.

(a) Let $v \in S$. Then for every linear functional $\phi \in S^0$, $\hat{v}(\phi) = \phi(v) = 0$. Hence, $\hat{v} \in (S^0)^0$. Therefore,
under the identification of $V$ and $V^{**}$, $v \in S^{00}$. Accordingly, $S \subseteq S^{00}$.

(b) Let $\phi \in S_2^0$. Then $\phi(v) = 0$ for every $v \in S_2$. But $S_1 \subseteq S_2$; hence, $\phi$ annihilates every element of $S_1$
(i.e., $\phi \in S_1^0$). Therefore, $S_2^0 \subseteq S_1^0$.

11.11. Prove Theorem 11.5: Suppose $V$ has finite dimension and $W$ is a subspace of $V$. Then
(i) $\dim W + \dim W^0 = \dim V$, (ii) $W^{00} = W$.

(i) Suppose $\dim V = n$ and $\dim W = r \le n$. We want to show that $\dim W^0 = n - r$. We choose a basis
$\{w_1, \ldots, w_r\}$ of $W$ and extend it to a basis of $V$, say $\{w_1, \ldots, w_r, v_1, \ldots, v_{n-r}\}$. Consider the dual
basis

$$\{\phi_1, \ldots, \phi_r, \sigma_1, \ldots, \sigma_{n-r}\}$$

By definition of the dual basis, each of the above $\sigma$'s annihilates each $w_i$; hence, $\sigma_1, \ldots, \sigma_{n-r} \in W^0$.
We claim that $\{\sigma_j\}$ is a basis of $W^0$. Now $\{\sigma_j\}$ is part of a basis of $V^*$, and so it is linearly independent.
We next show that $\{\sigma_j\}$ spans $W^0$. Let $\sigma \in W^0$. By Theorem 11.2,

$$\begin{aligned}
\sigma &= \sigma(w_1)\phi_1 + \cdots + \sigma(w_r)\phi_r + \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r} \\
&= 0\phi_1 + \cdots + 0\phi_r + \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r} \\
&= \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r}
\end{aligned}$$

Consequently, $\{\sigma_1, \ldots, \sigma_{n-r}\}$ spans $W^0$ and so it is a basis of $W^0$. Accordingly, as required,

$$\dim W^0 = n - r = \dim V - \dim W$$

(ii) Suppose $\dim V = n$ and $\dim W = r$. Then $\dim V^* = n$ and, by (i), $\dim W^0 = n - r$. Thus, by (i),
$\dim W^{00} = n - (n - r) = r$; therefore, $\dim W = \dim W^{00}$. By Problem 11.10, $W \subseteq W^{00}$. Accordingly,
$W = W^{00}$.

11.12. Let $U$ and $W$ be subspaces of $V$. Prove that $(U + W)^0 = U^0 \cap W^0$.

Let $\phi \in (U + W)^0$. Then $\phi$ annihilates $U + W$, and so, in particular, $\phi$ annihilates $U$ and $W$. That is,
$\phi \in U^0$ and $\phi \in W^0$; hence, $\phi \in U^0 \cap W^0$. Thus, $(U + W)^0 \subseteq U^0 \cap W^0$.

On the other hand, suppose $\sigma \in U^0 \cap W^0$. Then $\sigma$ annihilates $U$ and also $W$. If $v \in U + W$, then
$v = u + w$, where $u \in U$ and $w \in W$. Hence, $\sigma(v) = \sigma(u) + \sigma(w) = 0 + 0 = 0$. Thus, $\sigma$ annihilates $U + W$;
that is, $\sigma \in (U + W)^0$. Accordingly, $U^0 \cap W^0 \subseteq (U + W)^0$.

The two inclusion relations together give us the desired equality.

Remark: Observe that no dimension argument is employed in the proof; hence, the result holds for
spaces of finite or infinite dimension.

Transpose of a Linear Mapping 

11.13. Let $\phi$ be the linear functional on $\mathbf{R}^2$ defined by $\phi(x, y) = x - 2y$. For each of the following linear
operators $T$ on $\mathbf{R}^2$, find $(T^t(\phi))(x, y)$:

(a) $T(x, y) = (x, 0)$, (b) $T(x, y) = (y, x + y)$, (c) $T(x, y) = (2x - 3y, 5x + 2y)$

By definition, $T^t(\phi) = \phi \circ T$; that is, $(T^t(\phi))(v) = \phi(T(v))$ for every $v$. Hence,

(a) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(x, 0) = x$

(b) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(y, x + y) = y - 2(x + y) = -2x - y$

(c) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(2x - 3y, 5x + 2y) = (2x - 3y) - 2(5x + 2y) = -8x - 7y$
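
Because the transpose is just composition, Problem 11.13 translates almost word for word into code. This is a small sketch, with `transpose_map` as an illustrative name rather than anything defined in the text; functionals and operators are modelled as ordinary Python functions.

```python
def transpose_map(T):
    """Return T^t, the map sending a functional phi to the composite phi o T."""
    def T_t(phi):
        return lambda *v: phi(*T(*v))
    return T_t


# The functional and the three operators of Problem 11.13.
phi = lambda x, y: x - 2 * y

T_a = lambda x, y: (x, 0)
T_b = lambda x, y: (y, x + y)
T_c = lambda x, y: (2 * x - 3 * y, 5 * x + 2 * y)

# (T_c^t(phi))(x, y) evaluated on the standard basis gives its coefficients.
coeffs_c = (transpose_map(T_c)(phi)(1, 0), transpose_map(T_c)(phi)(0, 1))
```

Evaluating the pulled-back functional on the standard basis vectors recovers its coefficients, which is a quick way to compare against the hand computation.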

11.14. Let $T: V \to U$ be linear and let $T^t: U^* \to V^*$ be its transpose. Show that the kernel of $T^t$ is the
annihilator of the image of $T$; that is, $\operatorname{Ker} T^t = (\operatorname{Im} T)^0$.

Suppose $\phi \in \operatorname{Ker} T^t$; that is, $T^t(\phi) = \phi \circ T = 0$. If $u \in \operatorname{Im} T$, then $u = T(v)$ for some $v \in V$; hence,

$$\phi(u) = \phi(T(v)) = (\phi \circ T)(v) = 0(v) = 0$$

We have that $\phi(u) = 0$ for every $u \in \operatorname{Im} T$; hence, $\phi \in (\operatorname{Im} T)^0$. Thus, $\operatorname{Ker} T^t \subseteq (\operatorname{Im} T)^0$.

On the other hand, suppose $\sigma \in (\operatorname{Im} T)^0$; that is, $\sigma(\operatorname{Im} T) = \{0\}$. Then, for every $v \in V$,

$$(T^t(\sigma))(v) = (\sigma \circ T)(v) = \sigma(T(v)) = 0 = 0(v)$$

We have $(T^t(\sigma))(v) = 0(v)$ for every $v \in V$; hence, $T^t(\sigma) = 0$. Thus, $\sigma \in \operatorname{Ker} T^t$, and so
$(\operatorname{Im} T)^0 \subseteq \operatorname{Ker} T^t$.

The two inclusion relations together give us the required equality.

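11.15. Suppose $V$ and $U$ have finite dimension and $T: V \to U$ is linear. Prove $\operatorname{rank}(T) = \operatorname{rank}(T^t)$.

Suppose $\dim V = n$ and $\dim U = m$, and suppose $\operatorname{rank}(T) = r$. By Theorem 11.5,

$$\dim(\operatorname{Im} T)^0 = \dim U - \dim(\operatorname{Im} T) = m - \operatorname{rank}(T) = m - r$$

By Problem 11.14, $\operatorname{Ker} T^t = (\operatorname{Im} T)^0$. Hence, $\operatorname{nullity}(T^t) = m - r$. It then follows that, as claimed,

$$\operatorname{rank}(T^t) = \dim U^* - \operatorname{nullity}(T^t) = m - (m - r) = r = \operatorname{rank}(T)$$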
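
In matrix terms (see Theorem 11.7), the statement of Problem 11.15 says that a matrix and its transpose have the same rank. A hedged numerical illustration follows; the helper names `rank` and `transpose` are illustrative, and the sample matrix simply reuses the three vectors of Problem 11.28 as rows.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix, computed by forward elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for c in range(ncols):
        p = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][c] / rows[rk][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def transpose(matrix):
    """Transpose of a rectangular matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


A = [[1, 2, -3, 4],
     [1, 3, -2, 6],
     [1, 4, -1, 8]]
ranks = (rank(A), rank(transpose(A)))
```

This is only a check on one example, not a proof; the proof is the dimension count given in the solution.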

11.16. Prove Theorem 11.7: Let $T: V \to U$ be linear and let $A$ be the matrix representation of $T$ in the
bases $\{v_j\}$ of $V$ and $\{u_i\}$ of $U$. Then the transpose matrix $A^T$ is the matrix representation of
$T^t: U^* \to V^*$ in the bases dual to $\{u_i\}$ and $\{v_j\}$.

Suppose, for $j = 1, \ldots, m$,

$$T(v_j) = a_{j1}u_1 + a_{j2}u_2 + \cdots + a_{jn}u_n \qquad (1)$$

We want to prove that, for $i = 1, \ldots, n$,

$$T^t(\sigma_i) = a_{1i}\phi_1 + a_{2i}\phi_2 + \cdots + a_{mi}\phi_m \qquad (2)$$

where $\{\sigma_i\}$ and $\{\phi_j\}$ are the bases dual to $\{u_i\}$ and $\{v_j\}$, respectively.

Let $v \in V$ and suppose $v = k_1v_1 + k_2v_2 + \cdots + k_mv_m$. Then, by (1),

$$\begin{aligned}
T(v) &= k_1T(v_1) + k_2T(v_2) + \cdots + k_mT(v_m) \\
&= k_1(a_{11}u_1 + \cdots + a_{1n}u_n) + k_2(a_{21}u_1 + \cdots + a_{2n}u_n) + \cdots + k_m(a_{m1}u_1 + \cdots + a_{mn}u_n) \\
&= (k_1a_{11} + k_2a_{21} + \cdots + k_ma_{m1})u_1 + \cdots + (k_1a_{1n} + k_2a_{2n} + \cdots + k_ma_{mn})u_n \\
&= \sum_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + \cdots + k_ma_{mi})u_i
\end{aligned}$$

Hence, for $j = 1, \ldots, n$,

$$(T^t(\sigma_j))(v) = \sigma_j(T(v)) = \sigma_j\Big(\sum_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + \cdots + k_ma_{mi})u_i\Big) = k_1a_{1j} + k_2a_{2j} + \cdots + k_ma_{mj} \qquad (3)$$

On the other hand, for $j = 1, \ldots, n$,

$$(a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m)(v) = (a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m)(k_1v_1 + k_2v_2 + \cdots + k_mv_m) = k_1a_{1j} + k_2a_{2j} + \cdots + k_ma_{mj} \qquad (4)$$

Because $v \in V$ was arbitrary, (3) and (4) imply that

$$T^t(\sigma_j) = a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m, \qquad j = 1, \ldots, n$$

which is (2). Thus, the theorem is proved.


SUPPLEMENTARY PROBLEMS 


Dual Spaces and Dual Bases 

11.17. Find (a) $\phi + \sigma$, (b) $3\phi$, (c) $2\phi - 5\sigma$, where $\phi: \mathbf{R}^3 \to \mathbf{R}$ and $\sigma: \mathbf{R}^3 \to \mathbf{R}$ are defined by

$$\phi(x, y, z) = 2x - 3y + z \quad\text{and}\quad \sigma(x, y, z) = 4x - 2y + 3z$$

11.18. Find the dual basis of each of the following bases of $\mathbf{R}^3$: (a) $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$,
(b) $\{(1, -2, 3), (1, -1, 1), (2, -4, 7)\}$.

11.19. Let $V$ be the vector space of polynomials over $\mathbf{R}$ of degree $\le 2$. Let $\phi_1, \phi_2, \phi_3$ be the linear functionals on $V$
defined by

$$\phi_1(f(t)) = \int_0^1 f(t)\,dt, \qquad \phi_2(f(t)) = f'(1), \qquad \phi_3(f(t)) = f(0)$$

Here $f(t) = a + bt + ct^2 \in V$ and $f'(t)$ denotes the derivative of $f(t)$. Find the basis $\{f_1(t), f_2(t), f_3(t)\}$ of $V$
that is dual to $\{\phi_1, \phi_2, \phi_3\}$.

11.20. Suppose $u, v \in V$ and that $\phi(u) = 0$ implies $\phi(v) = 0$ for all $\phi \in V^*$. Show that $v = ku$ for some scalar $k$.

11.21. Suppose $\phi, \sigma \in V^*$ and that $\phi(v) = 0$ implies $\sigma(v) = 0$ for all $v \in V$. Show that $\sigma = k\phi$ for some scalar $k$.

11.22. Let $V$ be the vector space of polynomials over $K$. For $a \in K$, define $\phi_a: V \to K$ by $\phi_a(f(t)) = f(a)$. Show
that (a) $\phi_a$ is linear; (b) if $a \neq b$, then $\phi_a \neq \phi_b$.

11.23. Let $V$ be the vector space of polynomials of degree $\le 2$. Let $a, b, c \in K$ be distinct scalars. Let $\phi_a, \phi_b, \phi_c$
be the linear functionals defined by $\phi_a(f(t)) = f(a)$, $\phi_b(f(t)) = f(b)$, $\phi_c(f(t)) = f(c)$. Show that
$\{\phi_a, \phi_b, \phi_c\}$ is linearly independent, and find the basis $\{f_1(t), f_2(t), f_3(t)\}$ of $V$ that is its dual.

11.24. Let $V$ be the vector space of square matrices of order $n$. Let $T: V \to K$ be the trace mapping; that is,
$T(A) = a_{11} + a_{22} + \cdots + a_{nn}$, where $A = (a_{ij})$. Show that $T$ is linear.

11.25. Let $W$ be a subspace of $V$. For any linear functional $\phi$ on $W$, show that there is a linear functional $\sigma$ on $V$
such that $\sigma(w) = \phi(w)$ for any $w \in W$; that is, $\phi$ is the restriction of $\sigma$ to $W$.

11.26. Let $\{e_1, \ldots, e_n\}$ be the usual basis of $K^n$. Show that the dual basis is $\{\pi_1, \ldots, \pi_n\}$, where $\pi_i$ is the $i$th
projection mapping; that is, $\pi_i(a_1, \ldots, a_n) = a_i$.

11.27. Let $V$ be a vector space over $\mathbf{R}$. Let $\phi_1, \phi_2 \in V^*$ and suppose $\sigma: V \to \mathbf{R}$, defined by $\sigma(v) = \phi_1(v)\phi_2(v)$,
also belongs to $V^*$. Show that either $\phi_1 = 0$ or $\phi_2 = 0$.

Annihilators 

11.28. Let $W$ be the subspace of $\mathbf{R}^4$ spanned by $(1, 2, -3, 4)$, $(1, 3, -2, 6)$, $(1, 4, -1, 8)$. Find a basis of the
annihilator of $W$.

11.29. Let $W$ be the subspace of $\mathbf{R}^3$ spanned by $(1, 1, 0)$ and $(0, 1, 1)$. Find a basis of the annihilator of $W$.

11.30. Show that, for any subset $S$ of $V$, $\operatorname{span}(S) = S^{00}$, where $\operatorname{span}(S)$ is the linear span of $S$.

11.31. Let $U$ and $W$ be subspaces of a vector space $V$ of finite dimension. Prove that $(U \cap W)^0 = U^0 + W^0$.

11.32. Suppose $V = U \oplus W$. Prove that $V^* = U^0 \oplus W^0$.

Transpose of a Linear Mapping 

11.33. Let $\phi$ be the linear functional on $\mathbf{R}^2$ defined by $\phi(x, y) = 3x - 2y$. For each of the following linear
mappings $T: \mathbf{R}^3 \to \mathbf{R}^2$, find $(T^t(\phi))(x, y, z)$:

(a) $T(x, y, z) = (x + y, y + z)$, (b) $T(x, y, z) = (x + y + z, 2x - y)$

11.34. Suppose $T_1: U \to V$ and $T_2: V \to W$ are linear. Prove that $(T_2 \circ T_1)^t = T_1^t \circ T_2^t$.

11.35. Suppose $T: V \to U$ is linear and $V$ has finite dimension. Prove that $\operatorname{Im} T^t = (\operatorname{Ker} T)^0$.

11.36. Suppose $T: V \to U$ is linear and $u \in U$. Prove that $u \in \operatorname{Im} T$ or there exists $\phi \in U^*$ such that $T^t(\phi) = 0$
and $\phi(u) = 1$.

11.37. Let $V$ be of finite dimension. Show that the mapping $T \mapsto T^t$ is an isomorphism from $\operatorname{Hom}(V, V)$ onto
$\operatorname{Hom}(V^*, V^*)$. (Here $T$ is any linear operator on $V$.)

Miscellaneous Problems 

11.38. Let $V$ be a vector space over $\mathbf{R}$. The line segment $\overline{uv}$ joining points $u, v \in V$ is defined by
$\overline{uv} = \{tu + (1 - t)v : 0 \le t \le 1\}$. A subset $S$ of $V$ is convex if $u, v \in S$ implies $\overline{uv} \subseteq S$. Let $\phi \in V^*$. Define

$$W^+ = \{v \in V : \phi(v) > 0\}, \qquad W = \{v \in V : \phi(v) = 0\}, \qquad W^- = \{v \in V : \phi(v) < 0\}$$

Prove that $W^+$, $W$, and $W^-$ are convex.

11.39. Let V be a vector space of finite dimension. A hyperplane H of V may be defined as the kernel of a 
nonzero linear functional (f> on V. Show that every subspace of V is the intersection of a finite number of 
hyperplanes. 


ANSWERS TO SUPPLEMENTARY PROBLEMS 


11.17. (a) $6x - 5y + 4z$, (b) $6x - 9y + 3z$, (c) $-16x + 4y - 13z$

11.18. (a) $\phi_1 = x$, $\phi_2 = y$, $\phi_3 = z$; (b) $\phi_1 = -3x - 5y - 2z$, $\phi_2 = 2x + y$, $\phi_3 = x + 2y + z$

11.19. $f_1(t) = 3t - \tfrac{3}{2}t^2$, $f_2(t) = -\tfrac{1}{2}t + \tfrac{3}{4}t^2$, $f_3(t) = 1 - 3t + \tfrac{3}{2}t^2$

11.22. (b) Let $f(t) = t$. Then $\phi_a(f(t)) = a \neq b = \phi_b(f(t))$; and therefore, $\phi_a \neq \phi_b$

11.23. $f_1(t) = \dfrac{t^2 - (b + c)t + bc}{(a - b)(a - c)}$, $f_2(t) = \dfrac{t^2 - (a + c)t + ac}{(b - a)(b - c)}$, $f_3(t) = \dfrac{t^2 - (a + b)t + ab}{(c - a)(c - b)}$

11.28. $\{\phi_1(x, y, z, t) = 5x - y + z,\ \phi_2(x, y, z, t) = 2y - t\}$

11.29. $\{\phi(x, y, z) = x - y + z\}$

11.33. (a) $(T^t(\phi))(x, y, z) = 3x + y - 2z$, (b) $(T^t(\phi))(x, y, z) = -x + 5y + 3z$






Bilinear, Quadratic, 
and Hermitian Forms 


12.1 Introduction 


This chapter generalizes the notions of linear mappings and linear functionals. Specifically, we introduce 
the notion of a bilinear form. These bilinear maps also give rise to quadratic and Hermitian forms. 
Although quadratic forms were discussed previously, this chapter is treated independently of the previous 
results. 

Although the field K is arbitrary, we will later specialize to the cases K = R and K = C. Furthermore, 
we may sometimes need to divide by 2. In such cases, we must assume that $1 + 1 \neq 0$, which is true when
K = R or K = C. 


12.2 Bilinear Forms 


Let $V$ be a vector space of finite dimension over a field $K$. A bilinear form on $V$ is a mapping
$f: V \times V \to K$ such that, for all $a, b \in K$ and all $u_i, v_i \in V$:

(i) $f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)$,

(ii) $f(u, av_1 + bv_2) = af(u, v_1) + bf(u, v_2)$

We express condition (i) by saying $f$ is linear in the first variable, and condition (ii) by saying $f$ is linear in
the second variable.

EXAMPLE 12.1 

(a) Let $f$ be the dot product on $\mathbf{R}^n$; that is, for $u = (a_i)$ and $v = (b_i)$,

$$f(u, v) = u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n$$

Then $f$ is a bilinear form on $\mathbf{R}^n$. (In fact, any inner product on a real vector space $V$ is a bilinear form on $V$.)

(b) Let $\phi$ and $\sigma$ be arbitrary linear functionals on $V$. Let $f: V \times V \to K$ be defined by $f(u, v) = \phi(u)\sigma(v)$. Then $f$ is
a bilinear form, because $\phi$ and $\sigma$ are each linear.

(c) Let $A = [a_{ij}]$ be any $n \times n$ matrix over a field $K$. Then $A$ may be identified with the following bilinear form $f$ on
$K^n$, where $X = [x_i]$ and $Y = [y_i]$ are column vectors of variables:

$$f(X, Y) = X^TAY = \sum_{i,j} a_{ij}x_iy_j = a_{11}x_1y_1 + a_{12}x_1y_2 + \cdots + a_{nn}x_ny_n$$

The above formal expression in the variables $x_i, y_i$ is termed the bilinear polynomial corresponding to the matrix
$A$. Equation (12.1) shows that, in a certain sense, every bilinear form is of this type.

Space of Bilinear Forms

Let $B(V)$ denote the set of all bilinear forms on $V$. A vector space structure is placed on $B(V)$, where for
any $f, g \in B(V)$ and any $k \in K$, we define $f + g$ and $kf$ as follows:

$$(f + g)(u, v) = f(u, v) + g(u, v) \quad\text{and}\quad (kf)(u, v) = kf(u, v)$$

The following theorem (proved in Problem 12.4) applies.

THEOREM 12.1: Let $V$ be a vector space of dimension $n$ over $K$. Let $\{\phi_1, \ldots, \phi_n\}$ be any basis of the
dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$, where $f_{ij}$ is defined by
$f_{ij}(u, v) = \phi_i(u)\phi_j(v)$. Thus, in particular, $\dim B(V) = n^2$.

12.3 Bilinear Forms and Matrices 

Let $f$ be a bilinear form on $V$ and let $S = \{u_1, \ldots, u_n\}$ be a basis of $V$. Suppose $u, v \in V$ and

$$u = a_1u_1 + \cdots + a_nu_n \quad\text{and}\quad v = b_1u_1 + \cdots + b_nu_n$$

Then

$$f(u, v) = f(a_1u_1 + \cdots + a_nu_n,\ b_1u_1 + \cdots + b_nu_n) = \sum_{i,j} a_ib_jf(u_i, u_j)$$

Thus, $f$ is completely determined by the $n^2$ values $f(u_i, u_j)$.

The matrix $A = [a_{ij}]$ where $a_{ij} = f(u_i, u_j)$ is called the matrix representation of $f$ relative to the basis $S$
or, simply, the "matrix of $f$ in $S$." It "represents" $f$ in the sense that, for all $u, v \in V$,

$$f(u, v) = \sum_{i,j} a_ib_jf(u_i, u_j) = [u]_S^T A [v]_S \qquad (12.1)$$

[As usual, $[u]_S$ denotes the coordinate (column) vector of $u$ in the basis $S$.]
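
Equation (12.1) can be tried out on a concrete form. The sketch below borrows the bilinear form and the basis $\{(1,0), (1,1)\}$ of Problem 12.3; the function names `matrix_of_form` and `evaluate` are illustrative choices, not notation from the text.

```python
def f(u, v):
    """The bilinear form of Problem 12.3 on R^2."""
    (x1, x2), (y1, y2) = u, v
    return 2 * x1 * y1 - 3 * x1 * y2 + 4 * x2 * y2


def matrix_of_form(form, basis):
    """Matrix A = [a_ij] with a_ij = form(u_i, u_j)."""
    return [[form(ui, uj) for uj in basis] for ui in basis]


def evaluate(A, a, b):
    """Compute a^T A b for coordinate vectors a and b."""
    return sum(a[i] * A[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def from_coordinates(coords, basis):
    """Vector with the given coordinates relative to the basis."""
    return tuple(sum(c * vec[k] for c, vec in zip(coords, basis))
                 for k in range(len(basis[0])))


S = [(1, 0), (1, 1)]
A = matrix_of_form(f, S)

# Coordinates of two sample vectors relative to S (arbitrary choices).
a, b = [3, 2], [-1, 4]
u, v = from_coordinates(a, S), from_coordinates(b, S)
```

Here `f(u, v)` is computed from the defining formula, while `evaluate(A, a, b)` uses only the matrix and the coordinate vectors; (12.1) says the two must agree.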

Change of Basis, Congruent Matrices 

We now ask, how does a matrix representing a bilinear form transform when a new basis is selected? The 
answer is given in the following theorem (proved in Problem 12.5). 

THEOREM 12.2: Let $P$ be a change-of-basis matrix from one basis $S$ to another basis $S'$. If $A$ is the
matrix representing a bilinear form $f$ in the original basis $S$, then $B = P^TAP$ is the
matrix representing $f$ in the new basis $S'$.

The above theorem motivates the following definition.

DEFINITION: A matrix $B$ is congruent to a matrix $A$, written $B \simeq A$, if there exists a nonsingular
matrix $P$ such that $B = P^TAP$.

Thus, by Theorem 12.2, matrices representing the same bilinear form are congruent. We remark that 
congruent matrices have the same rank, because P and P T are nonsingular; hence, the following definition 
is well defined. 

DEFINITION: The rank of a bilinear form $f$ on $V$, written $\operatorname{rank}(f)$, is the rank of any matrix
representation of $f$. We say $f$ is degenerate or nondegenerate according to whether
$\operatorname{rank}(f) < \dim V$ or $\operatorname{rank}(f) = \dim V$.
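
As a quick numerical illustration of Theorem 12.2, the following sketch forms $P^TAP$ for the matrices that appear in Problem 12.3. The helper names are illustrative only, and plain nested lists stand in for matrices.

```python
def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def congruent_transform(P, A):
    """Return P^T A P, the matrix of the same form in the new basis."""
    return matmul(matmul(transpose(P), A), P)


# Matrix of the form in the basis {(1,0), (1,1)} and the change-of-basis
# matrix to {(2,1), (1,-1)}, both taken from Problem 12.3.
A = [[2, -1], [2, 3]]
P = [[1, 2], [1, -1]]
B = congruent_transform(P, A)
```

The result should coincide with the matrix $B$ computed entry by entry in Problem 12.3(b). Both $A$ and $B$ have nonzero determinant, so the form there is nondegenerate.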

12.4 Alternating Bilinear Forms 

Let $f$ be a bilinear form on $V$. Then $f$ is called

(i) alternating if $f(v, v) = 0$ for every $v \in V$;

(ii) skew-symmetric if $f(u, v) = -f(v, u)$ for every $u, v \in V$.

Now suppose (i) is true. Then (ii) is true, because, for any $u, v \in V$,

$$0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v) = f(u, v) + f(v, u)$$

On the other hand, suppose (ii) is true and also $1 + 1 \neq 0$. Then (i) is true, because, for every $v \in V$, we
have $f(v, v) = -f(v, v)$. In other words, alternating and skew-symmetric are equivalent when $1 + 1 \neq 0$.
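
The role of the hypothesis $1 + 1 \neq 0$ can be seen in a tiny experiment. The sketch below is an illustration under stated assumptions: forms are given by integer matrices, and arithmetic modulo 2 stands in for a field in which $1 + 1 = 0$. The name `form` and the two sample matrices are illustrative.

```python
def form(A, u, v, mod=None):
    """Evaluate u^T A v; if mod is given, reduce the result modulo mod."""
    value = sum(u[i] * A[i][j] * v[j]
                for i in range(len(u)) for j in range(len(v)))
    return value % mod if mod is not None else value


# Over the integers (1 + 1 != 0), a skew-symmetric matrix gives an
# alternating form: f(v, v) = 0 for every v.
J = [[0, 1], [-1, 0]]
samples = [(1, 2), (3, -5), (0, 7)]
alternating_over_Z = all(form(J, v, v) == 0 for v in samples)

# Modulo 2 (1 + 1 = 0), f(u, v) = u_1 v_1 satisfies f(u, v) = -f(v, u),
# because -1 = 1, yet f(v, v) is not always 0.
ONE = [[1]]
skew_mod_2 = form(ONE, [1], [1], 2) == (-form(ONE, [1], [1], 2)) % 2
value_on_diagonal = form(ONE, [1], [1], 2)
```

So modulo 2 the form is skew-symmetric but not alternating, which is why the equivalence is stated only when $1 + 1 \neq 0$.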


The main structure theorem of alternating bilinear forms (proved in Problem 12.23) is as follows. 


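THEOREM 12.3: Let $f$ be an alternating bilinear form on $V$. Then there exists a basis of $V$ in which $f$ is
represented by a block diagonal matrix $M$ of the form

$$M = \operatorname{diag}\left(\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \ldots, \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, [0], [0], \ldots, [0]\right)$$

Moreover, the number of nonzero blocks is uniquely determined by $f$ [because it is
equal to $\tfrac{1}{2}\operatorname{rank}(f)$].

In particular, the above theorem shows that any alternating bilinear form must have even rank.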


12.5 Symmetric Bilinear Forms, Quadratic Forms 


This section investigates the important notions of symmetric bilinear forms and quadratic forms and their
representation by means of symmetric matrices. The only restriction on the field $K$ is that $1 + 1 \neq 0$. In
Section 12.6, we will restrict $K$ to be the real field $\mathbf{R}$, which yields important special results.

Symmetric Bilinear Forms

Let $f$ be a bilinear form on $V$. Then $f$ is said to be symmetric if, for every $u, v \in V$,

$$f(u, v) = f(v, u)$$

One can easily show that $f$ is symmetric if and only if any matrix representation $A$ of $f$ is a symmetric
matrix.

The main result for symmetric bilinear forms (proved in Problem 12.10) is as follows. (We emphasize
that we are assuming that $1 + 1 \neq 0$.)

THEOREM 12.4: Let $f$ be a symmetric bilinear form on $V$. Then $V$ has a basis $\{v_1, \ldots, v_n\}$ in which $f$
is represented by a diagonal matrix; that is, where $f(v_i, v_j) = 0$ for $i \neq j$.

THEOREM 12.4: (Alternative Form) Let $A$ be a symmetric matrix over $K$. Then $A$ is congruent to a
diagonal matrix; that is, there exists a nonsingular matrix $P$ such that $P^TAP$ is
diagonal.

Diagonalization Algorithm 

Recall that a nonsingular matrix $P$ is a product of elementary matrices. Accordingly, one way of obtaining
the diagonal form $D = P^TAP$ is by a sequence of elementary row operations and the same sequence of
elementary column operations. This same sequence of elementary row operations on the identity matrix $I$
will yield $P^T$. This algorithm is formalized below.

ALGORITHM 12.1: (Congruence Diagonalization of a Symmetric Matrix) The input is a symmetric
matrix $A = [a_{ij}]$ of order $n$.

Step 1. Form the $n \times 2n$ (block) matrix $M = [A_1, I]$, where $A_1 = A$ is the left half of $M$ and the identity
matrix $I$ is the right half of $M$.

Step 2. Examine the entry $a_{11}$. There are three cases.

Case I: $a_{11} \neq 0$. (Use $a_{11}$ as a pivot to put 0's below $a_{11}$ in $M$ and to the right of $a_{11}$ in $A_1$.)

For $i = 2, \ldots, n$:

(a) Apply the row operation "Replace $R_i$ by $-a_{i1}R_1 + a_{11}R_i$."

(b) Apply the corresponding column operation "Replace $C_i$ by $-a_{i1}C_1 + a_{11}C_i$."

These operations reduce the matrix $M$ to the form

$$M \sim \begin{bmatrix} a_{11} & 0 & * & * \\ 0 & A_2 & * & * \end{bmatrix} \qquad (*)$$

Case II: $a_{11} = 0$ but $a_{kk} \neq 0$, for some $k > 1$.

(a) Apply the row operation "Interchange $R_1$ and $R_k$."

(b) Apply the corresponding column operation "Interchange $C_1$ and $C_k$."

(These operations bring $a_{kk}$ into the first diagonal position, which reduces the matrix
to Case I.)

Case III: All diagonal entries $a_{ii} = 0$ but some $a_{ij} \neq 0$.

(a) Apply the row operation "Replace $R_i$ by $R_j + R_i$."

(b) Apply the corresponding column operation "Replace $C_i$ by $C_j + C_i$."

(These operations bring $2a_{ij}$ into the $i$th diagonal position, which reduces the matrix
to Case II.)

Thus, $M$ is finally reduced to the form (*), where $A_2$ is a symmetric matrix of order less than $A$.

Step 3. Repeat Step 2 with each new matrix $A_k$ (by neglecting the first row and column of the preceding
matrix) until $A$ is diagonalized. Then $M$ is transformed into the form $M' = [D, Q]$, where $D$ is
diagonal.

Step 4. Set $P = Q^T$. Then $D = P^TAP$.


Remark 1: We emphasize that in Step 2, the row operations will change both sides of M, but the 
column operations will only change the left half of M. 


Remark 2: The condition $1 + 1 \neq 0$ is used in Case III, where we assume that $2a_{ij} \neq 0$ when
$a_{ij} \neq 0$.


The justification for the above algorithm appears in Problem 12.9. 
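
Algorithm 12.1 is short enough to transcribe directly. The following is a sketch under stated assumptions, not a definitive implementation: entries are rational (so $1 + 1 \neq 0$), the input is assumed symmetric, and the function name `congruence_diagonalize` is a hypothetical choice. The left and right halves of $M$ are kept as two separate lists, `L` and `Q`.

```python
from fractions import Fraction


def congruence_diagonalize(A):
    """Return (D, P) with D = P^T A P diagonal, following Algorithm 12.1.

    A is assumed to be a symmetric square matrix with rational entries.
    """
    n = len(A)
    L = [[Fraction(x) for x in row] for row in A]          # left half of M
    Q = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # right half

    def combine(i, j, c_i, c_j):
        # Row operation R_i <- c_j*R_j + c_i*R_i on both halves of M ...
        L[i] = [c_j * a + c_i * b for a, b in zip(L[j], L[i])]
        Q[i] = [c_j * a + c_i * b for a, b in zip(Q[j], Q[i])]
        # ... followed by the same column operation on the left half only.
        for r in range(n):
            L[r][i] = c_j * L[r][j] + c_i * L[r][i]

    def swap(i, j):
        L[i], L[j] = L[j], L[i]
        Q[i], Q[j] = Q[j], Q[i]
        for r in range(n):
            L[r][i], L[r][j] = L[r][j], L[r][i]

    for k in range(n):
        if L[k][k] == 0:
            # Case II: look for a nonzero diagonal entry further down.
            d = next((m for m in range(k + 1, n) if L[m][m] != 0), None)
            if d is not None:
                swap(k, d)
            else:
                # Case III: all remaining diagonal entries are zero.
                pair = next(((i, j) for i in range(k, n)
                             for j in range(i + 1, n) if L[i][j] != 0), None)
                if pair is None:
                    break  # the remaining block is zero; already diagonal
                i, j = pair
                combine(i, j, 1, 1)  # puts 2*a_ij in position (i, i)
                if i != k:
                    swap(k, i)
        # Case I: clear the rest of row k and column k using the pivot.
        pivot = L[k][k]
        for i in range(k + 1, n):
            if L[i][k] != 0:
                combine(i, k, pivot, -L[i][k])

    P = [list(col) for col in zip(*Q)]  # Step 4: P = Q^T
    return L, P
```

A design note: row and column operations commute, so applying each column operation immediately after its row operation gives the same result as the "all rows, then all columns" order used in the worked examples. Because Case I scales rows by the pivot, the signs in $Q$ may differ from a hand computation that chooses different multiples, but $D = P^TAP$ holds either way.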


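EXAMPLE 12.2 Let $A = \begin{bmatrix} 1 & 2 & -3 \\ 2 & 5 & -4 \\ -3 & -4 & 8 \end{bmatrix}$. Apply Algorithm 12.1 to find a nonsingular matrix $P$ such
that $D = P^TAP$ is diagonal.

First form the block matrix $M = [A, I]$; that is, let

$$M = [A, I] = \left[\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 2 & 5 & -4 & 0 & 1 & 0 \\ -3 & -4 & 8 & 0 & 0 & 1 \end{array}\right]$$

Apply the row operations "Replace $R_2$ by $-2R_1 + R_2$" and "Replace $R_3$ by $3R_1 + R_3$" to $M$, and then apply the
corresponding column operations "Replace $C_2$ by $-2C_1 + C_2$" and "Replace $C_3$ by $3C_1 + C_3$" to obtain

$$\left[\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right]$$

Next apply the row operation "Replace $R_3$ by $-2R_2 + R_3$" and then the corresponding column operation "Replace $C_3$
by $-2C_2 + C_3$" to obtain

$$\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right]$$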


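Now $A$ has been diagonalized. Set

$$P = \begin{bmatrix} 1 & -2 & 7 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and then}\quad D = P^TAP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -5 \end{bmatrix}$$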


We emphasize that P is the transpose of the right half of the final matrix. 

Quadratic Forms 

We begin with a definition. 

DEFINITION A: A mapping $q: V \to K$ is a quadratic form if $q(v) = f(v, v)$ for some symmetric
bilinear form $f$ on $V$.

If $1 + 1 \neq 0$ in $K$, then the bilinear form $f$ can be obtained from the quadratic form $q$ by the following
polar form of $f$:

$$f(u, v) = \tfrac{1}{2}[q(u + v) - q(u) - q(v)]$$

Now suppose $f$ is represented by a symmetric matrix $A = [a_{ij}]$, and $1 + 1 \neq 0$. Letting $X = [x_i]$ denote
a column vector of variables, $q$ can be represented in the form

$$q(X) = f(X, X) = X^TAX = \sum_{i,j} a_{ij}x_ix_j = \sum_i a_{ii}x_i^2 + 2\sum_{i<j} a_{ij}x_ix_j$$

The above formal expression in the variables $x_i$ is also called a quadratic form. Namely, we have the
following second definition.

DEFINITION B: A quadratic form $q$ in variables $x_1, x_2, \ldots, x_n$ is a polynomial such that every term has
degree two; that is,

$$q(x_1, x_2, \ldots, x_n) = \sum_i c_ix_i^2 + \sum_{i<j} d_{ij}x_ix_j$$

Using $1 + 1 \neq 0$, the quadratic form $q$ in Definition B determines a symmetric matrix $A = [a_{ij}]$ where
$a_{ii} = c_i$ and $a_{ij} = a_{ji} = \tfrac{1}{2}d_{ij}$. Thus, Definitions A and B are essentially the same.
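
The passage from Definition B to a symmetric matrix, and the polar form, can both be checked on a small example. This is a minimal sketch assuming rational coefficients and 0-based indices; `symmetric_matrix`, `q`, and `polar` are illustrative names. The sample form $3x^2 + xz - 2yz$ is the one in Problem 12.6(b).

```python
from fractions import Fraction


def symmetric_matrix(n, squares, cross):
    """Build A from Definition B.

    squares maps i -> c_i and cross maps (i, j), i < j, -> d_ij
    (0-based indices); a_ii = c_i and a_ij = a_ji = d_ij / 2.
    """
    A = [[Fraction(0)] * n for _ in range(n)]
    for i, c in squares.items():
        A[i][i] = Fraction(c)
    for (i, j), d in cross.items():
        A[i][j] = A[j][i] = Fraction(d) / 2
    return A


def q(A, x):
    """Quadratic form q(x) = x^T A x."""
    return sum(x[i] * A[i][j] * x[j]
               for i in range(len(x)) for j in range(len(x)))


def polar(A, u, v):
    """Recover f(u, v) from q via f(u, v) = (q(u+v) - q(u) - q(v)) / 2."""
    s = [a + b for a, b in zip(u, v)]
    return (q(A, s) - q(A, u) - q(A, v)) / 2


# q'(x, y, z) = 3x^2 + xz - 2yz
A = symmetric_matrix(3, {0: 3}, {(0, 2): 1, (1, 2): -2})
```

With exact fractions the half-integer entries are represented without rounding, and `polar(A, u, v)` should agree with the direct evaluation of $u^TAv$.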

If the matrix representation $A$ of $q$ is diagonal, then $q$ has the diagonal representation

$$q(X) = X^TAX = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2$$

That is, the quadratic polynomial representing $q$ will contain no "cross product" terms. Moreover, by
Theorem 12.4, every quadratic form has such a representation (when $1 + 1 \neq 0$).

12.6 Real Symmetric Bilinear Forms, Law of Inertia 

This section treats symmetric bilinear forms and quadratic forms on vector spaces V over the real field R. 
The special nature of R permits an independent theory. The main result (proved in Problem 12.14) is as 
follows. 

THEOREM 12.5: Let $f$ be a symmetric bilinear form on $V$ over $\mathbf{R}$. Then there exists a basis of $V$ in which $f$ is
represented by a diagonal matrix. Every other diagonal matrix representation of $f$ has
the same number $p$ of positive entries and the same number $n$ of negative entries.

The above result is sometimes called the Law of Inertia or Sylvester's Theorem. The rank and signature
of the symmetric bilinear form $f$ are denoted and defined by

$$\operatorname{rank}(f) = p + n \quad\text{and}\quad \operatorname{sig}(f) = p - n$$

These are uniquely defined by Theorem 12.5. 
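
Once a diagonal representation is known, the rank and signature are a matter of counting signs. A small sketch (the function name is illustrative), applied to the diagonal $\operatorname{diag}(1, 1, -5)$ obtained in Example 12.2:

```python
def rank_and_signature(diagonal):
    """Return (rank, signature) = (p + n, p - n) from the diagonal entries."""
    p = sum(1 for d in diagonal if d > 0)  # number of positive entries
    n = sum(1 for d in diagonal if d < 0)  # number of negative entries
    return p + n, p - n


result = rank_and_signature([1, 1, -5])
```

By the Law of Inertia the answer does not depend on which diagonal representation of the form is used.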

A real symmetric bilinear form $f$ is said to be

(i) positive definite if $q(v) = f(v, v) > 0$ for every $v \neq 0$,

(ii) nonnegative semidefinite if $q(v) = f(v, v) \ge 0$ for every $v$.

EXAMPLE 12.3 Let $f$ be the dot product on $\mathbf{R}^n$. Recall that $f$ is a symmetric bilinear form on $\mathbf{R}^n$. We note
that $f$ is also positive definite. That is, for any $u = (a_i) \neq 0$ in $\mathbf{R}^n$,

$$f(u, u) = a_1^2 + a_2^2 + \cdots + a_n^2 > 0$$

Section 12.5 and Chapter 13 tell us how to diagonalize a real quadratic form $q$ or, equivalently, a real
symmetric matrix $A$ by means of an orthogonal transition matrix $P$. If $P$ is merely nonsingular, then $q$ can
be represented in diagonal form with only 1's and $-1$'s as nonzero coefficients. Namely, we have the
following corollary.

COROLLARY 12.6: Any real quadratic form $q$ has a unique representation in the form

$$q(x_1, x_2, \ldots, x_n) = x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_r^2$$

where $r = p + n$ is the rank of the form.

COROLLARY 12.6: (Alternative Form) Any real symmetric matrix $A$ is congruent to the unique
diagonal matrix

$$D = \operatorname{diag}(I_p, -I_n, 0)$$

where $r = p + n$ is the rank of $A$.


12.7 Hermitian Forms 


Let $V$ be a vector space of finite dimension over the complex field $\mathbf{C}$. A Hermitian form on $V$ is a mapping
$f: V \times V \to \mathbf{C}$ such that, for all $a, b \in \mathbf{C}$ and all $u_i, v \in V$,

(i) $f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)$,

(ii) $f(u, v) = \overline{f(v, u)}$.

(As usual, $\bar{k}$ denotes the complex conjugate of $k \in \mathbf{C}$.)

Using (i) and (ii), we get

$$\begin{aligned}
f(u, av_1 + bv_2) &= \overline{f(av_1 + bv_2, u)} = \overline{af(v_1, u) + bf(v_2, u)} \\
&= \bar{a}\,\overline{f(v_1, u)} + \bar{b}\,\overline{f(v_2, u)} = \bar{a}f(u, v_1) + \bar{b}f(u, v_2)
\end{aligned}$$

That is,

(iii) $f(u, av_1 + bv_2) = \bar{a}f(u, v_1) + \bar{b}f(u, v_2)$.

As before, we express condition (i) by saying $f$ is linear in the first variable. On the other hand, we express
condition (iii) by saying $f$ is "conjugate linear" in the second variable. Moreover, condition (ii) tells us that
$f(v, v) = \overline{f(v, v)}$, and hence, $f(v, v)$ is real for every $v \in V$.

The results of Sections 12.5 and 12.6 for symmetric forms have their analogues for Hermitian forms.
Thus, the mapping $q: V \to \mathbf{C}$, defined by $q(v) = f(v, v)$, is called the Hermitian quadratic form or complex
quadratic form associated with the Hermitian form $f$. We can obtain $f$ from $q$ by the polar form

$$f(u, v) = \tfrac{1}{4}[q(u + v) - q(u - v)] + \tfrac{i}{4}[q(u + iv) - q(u - iv)]$$

Now suppose $S = \{u_1, \ldots, u_n\}$ is a basis of $V$. The matrix $H = [h_{ij}]$ where $h_{ij} = f(u_i, u_j)$ is called the
matrix representation of $f$ in the basis $S$. By (ii), $f(u_i, u_j) = \overline{f(u_j, u_i)}$; hence, $H$ is Hermitian and, in
particular, the diagonal entries of $H$ are real. Thus, any diagonal representation of $f$ contains only real
entries.

The next theorem (to be proved in Problem 12.47) is the complex analog of Theorem 12.5 on real
symmetric bilinear forms.

THEOREM 12.7: Let $f$ be a Hermitian form on $V$ over $\mathbf{C}$. Then there exists a basis of $V$ in which $f$ is
represented by a diagonal matrix. Every other diagonal matrix representation of $f$
has the same number $p$ of positive entries and the same number $n$ of negative
entries.

Again the rank and signature of the Hermitian form $f$ are denoted and defined by

$$\operatorname{rank}(f) = p + n \quad\text{and}\quad \operatorname{sig}(f) = p - n$$

These are uniquely defined by Theorem 12.7.

Analogously, a Hermitian form $f$ is said to be

(i) positive definite if $q(v) = f(v, v) > 0$ for every $v \neq 0$,

(ii) nonnegative semidefinite if $q(v) = f(v, v) \ge 0$ for every $v$.

EXAMPLE 12.4 Let $f$ be the dot product on $\mathbf{C}^n$; that is, for any $u = (z_i)$ and $v = (w_i)$ in $\mathbf{C}^n$,

$$f(u, v) = u \cdot v = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n$$

Then $f$ is a Hermitian form on $\mathbf{C}^n$. Moreover, $f$ is also positive definite, because, for any $u = (z_i) \neq 0$ in $\mathbf{C}^n$,

$$f(u, u) = z_1\bar{z}_1 + z_2\bar{z}_2 + \cdots + z_n\bar{z}_n = |z_1|^2 + |z_2|^2 + \cdots + |z_n|^2 > 0$$
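
The Hermitian polar form can be sanity-checked numerically on the dot product of Example 12.4. The sketch below uses Python's built-in complex numbers; the sample vectors are arbitrary choices with small integer parts so that the floating-point arithmetic stays exact, and the helper names are illustrative.

```python
def f(u, v):
    """Dot product on C^n: sum of z_i * conj(w_i)."""
    return sum(z * w.conjugate() for z, w in zip(u, v))


def q(u):
    """Associated Hermitian quadratic form q(u) = f(u, u)."""
    return f(u, u)


def combine(u, v, c):
    """Return the vector u + c*v."""
    return [z + c * w for z, w in zip(u, v)]


def polar(u, v):
    """Recover f(u, v) from q using the Hermitian polar form."""
    real_part = (q(combine(u, v, 1)) - q(combine(u, v, -1))) / 4
    imag_part = (q(combine(u, v, 1j)) - q(combine(u, v, -1j))) / 4
    return real_part + 1j * imag_part


u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 0 + 1j]
```

The first bracket of the polar form recovers the real part of $f(u, v)$ and the second recovers the imaginary part, which is why both are needed in the complex case.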


SOLVED PROBLEMS 


Bilinear Forms 

12.1. Let $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$. Express $f$ in matrix notation, where

$$f(u, v) = 3x_1y_1 - 2x_1y_3 + 5x_2y_1 + 7x_2y_2 - 8x_2y_3 + 4x_3y_2 - 6x_3y_3$$

Let $A = [a_{ij}]$, where $a_{ij}$ is the coefficient of $x_iy_j$. Then

$$f(u, v) = X^TAY = [x_1, x_2, x_3] \begin{bmatrix} 3 & 0 & -2 \\ 5 & 7 & -8 \\ 0 & 4 & -6 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$$


12.2. Let $A$ be an $n \times n$ matrix over $K$. Show that the mapping $f$ defined by $f(X, Y) = X^TAY$ is a bilinear
form on $K^n$.

For any $a, b \in K$ and any $X_i, Y_i \in K^n$,

$$\begin{aligned}
f(aX_1 + bX_2, Y) &= (aX_1 + bX_2)^TAY = (aX_1^T + bX_2^T)AY \\
&= aX_1^TAY + bX_2^TAY = af(X_1, Y) + bf(X_2, Y)
\end{aligned}$$

Hence, $f$ is linear in the first variable. Also,

$$f(X, aY_1 + bY_2) = X^TA(aY_1 + bY_2) = aX^TAY_1 + bX^TAY_2 = af(X, Y_1) + bf(X, Y_2)$$

Hence, $f$ is linear in the second variable, and so $f$ is a bilinear form on $K^n$.







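12.3. Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by

$$f[(x_1, x_2), (y_1, y_2)] = 2x_1y_1 - 3x_1y_2 + 4x_2y_2$$

(a) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 0),\ u_2 = (1, 1)\}$.

(b) Find the matrix $B$ of $f$ in the basis $\{v_1 = (2, 1),\ v_2 = (1, -1)\}$.

(c) Find the change-of-basis matrix $P$ from the basis $\{u_i\}$ to the basis $\{v_i\}$, and verify that
$B = P^TAP$.

(a) Set $A = [a_{ij}]$, where $a_{ij} = f(u_i, u_j)$. This yields

$$\begin{aligned}
a_{11} &= f[(1, 0), (1, 0)] = 2 - 0 - 0 = 2, & a_{21} &= f[(1, 1), (1, 0)] = 2 - 0 + 0 = 2 \\
a_{12} &= f[(1, 0), (1, 1)] = 2 - 3 - 0 = -1, & a_{22} &= f[(1, 1), (1, 1)] = 2 - 3 + 4 = 3
\end{aligned}$$

Thus, $A = \begin{bmatrix} 2 & -1 \\ 2 & 3 \end{bmatrix}$ is the matrix of $f$ in the basis $\{u_1, u_2\}$.

(b) Set $B = [b_{ij}]$, where $b_{ij} = f(v_i, v_j)$. This yields

$$\begin{aligned}
b_{11} &= f[(2, 1), (2, 1)] = 8 - 6 + 4 = 6, & b_{21} &= f[(1, -1), (2, 1)] = 4 - 3 - 4 = -3 \\
b_{12} &= f[(2, 1), (1, -1)] = 4 + 6 - 4 = 6, & b_{22} &= f[(1, -1), (1, -1)] = 2 + 3 + 4 = 9
\end{aligned}$$

Thus, $B = \begin{bmatrix} 6 & 6 \\ -3 & 9 \end{bmatrix}$ is the matrix of $f$ in the basis $\{v_1, v_2\}$.

(c) Writing $v_1$ and $v_2$ in terms of the $u_i$ yields $v_1 = u_1 + u_2$ and $v_2 = 2u_1 - u_2$. Then

$$P = \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix}, \qquad P^T = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}$$

and

$$P^TAP = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 6 & 6 \\ -3 & 9 \end{bmatrix} = B$$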


12.4. Prove Theorem 12.1: Let $V$ be an $n$-dimensional vector space over $K$. Let $\{\phi_1, \ldots, \phi_n\}$ be any
basis of the dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$, where $f_{ij}$ is defined by
$f_{ij}(u, v) = \phi_i(u)\phi_j(v)$. Thus, $\dim B(V) = n^2$.

Let $\{u_1, \ldots, u_n\}$ be the basis of $V$ dual to $\{\phi_i\}$. We first show that $\{f_{ij}\}$ spans $B(V)$. Let $f \in B(V)$ and
suppose $f(u_i, u_j) = a_{ij}$. We claim that $f = \sum_{i,j} a_{ij}f_{ij}$. It suffices to show that

$$f(u_s, u_t) = \Big(\sum a_{ij}f_{ij}\Big)(u_s, u_t) \quad\text{for}\quad s, t = 1, \ldots, n$$

We have

$$\Big(\sum a_{ij}f_{ij}\Big)(u_s, u_t) = \sum a_{ij}f_{ij}(u_s, u_t) = \sum a_{ij}\phi_i(u_s)\phi_j(u_t) = \sum a_{ij}\delta_{is}\delta_{jt} = a_{st} = f(u_s, u_t)$$

as required. Hence, $\{f_{ij}\}$ spans $B(V)$. Next, suppose $\sum a_{ij}f_{ij} = 0$. Then for $s, t = 1, \ldots, n$,

$$0 = 0(u_s, u_t) = \Big(\sum a_{ij}f_{ij}\Big)(u_s, u_t) = a_{st}$$

The last step follows as above. Thus, $\{f_{ij}\}$ is independent, and hence is a basis of $B(V)$.

12.5. Prove Theorem 12.2. Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$. Let $A$ be the
matrix representing a bilinear form $f$ in the basis $S$. Then $B = P^TAP$ is the matrix representing $f$ in
the basis $S'$.

Let $u, v \in V$. Because $P$ is the change-of-basis matrix from $S$ to $S'$, we have $P[u]_{S'} = [u]_S$ and also
$P[v]_{S'} = [v]_S$; hence, $[u]_S^T = [u]_{S'}^TP^T$. Thus,

$$f(u, v) = [u]_S^TA[v]_S = [u]_{S'}^TP^TAP[v]_{S'}$$

Because $u$ and $v$ are arbitrary elements of $V$, $P^TAP$ is the matrix of $f$ in the basis $S'$.

Symmetric Bilinear Forms, Quadratic Forms

12.6. Find the symmetric matrix that corresponds to each of the following quadratic forms:

(a) $q(x, y, z) = 3x^2 + 4xy - y^2 + 8xz - 6yz + z^2$,

(b) $q'(x, y, z) = 3x^2 + xz - 2yz$, (c) $q''(x, y, z) = 2x^2 - 5y^2 - 7z^2$

The symmetric matrix $A = [a_{ij}]$ that represents $q(x_1, \ldots, x_n)$ has the diagonal entry $a_{ii}$ equal to the
coefficient of the square term $x_i^2$ and the nondiagonal entries $a_{ij}$ and $a_{ji}$ each equal to half of the coefficient of
the cross-product term $x_ix_j$. Thus,

$$\text{(a) } A = \begin{bmatrix} 3 & 2 & 4 \\ 2 & -1 & -3 \\ 4 & -3 & 1 \end{bmatrix}, \quad \text{(b) } A' = \begin{bmatrix} 3 & 0 & \tfrac{1}{2} \\ 0 & 0 & -1 \\ \tfrac{1}{2} & -1 & 0 \end{bmatrix}, \quad \text{(c) } A'' = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & -7 \end{bmatrix}$$

The third matrix $A''$ is diagonal, because the quadratic form $q''$ is diagonal; that is, $q''$ has no cross-product
terms.


12.7. Find the quadratic form $q(X)$ that corresponds to each of the following symmetric matrices:

$$\text{(a) } A = \begin{bmatrix} 5 & -3 \\ -3 & 8 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 4 & -5 & 7 \\ -5 & -6 & 8 \\ 7 & 8 & -9 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 2 & 4 & -1 & 5 \\ 4 & -7 & -6 & 8 \\ -1 & -6 & 3 & 9 \\ 5 & 8 & 9 & 1 \end{bmatrix}$$

The quadratic form $q(X)$ that corresponds to a symmetric matrix $M$ is defined by $q(X) = X^TMX$,
where $X = [x_i]$ is the column vector of unknowns.

(a) Compute as follows:

$$\begin{aligned}
q(x, y) = X^TAX &= [x, y] \begin{bmatrix} 5 & -3 \\ -3 & 8 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = [5x - 3y,\ -3x + 8y] \begin{bmatrix} x \\ y \end{bmatrix} \\
&= 5x^2 - 3xy - 3xy + 8y^2 = 5x^2 - 6xy + 8y^2
\end{aligned}$$

As expected, the coefficient 5 of the square term $x^2$ and the coefficient 8 of the square term $y^2$ are
the diagonal elements of $A$, and the coefficient $-6$ of the cross-product term $xy$ is the sum of
the nondiagonal elements $-3$ and $-3$ of $A$ (or twice the nondiagonal element $-3$, because $A$ is
symmetric).

(b) Because $B$ is a three-square matrix, there are three unknowns, say $x, y, z$ or $x_1, x_2, x_3$. Then

$$q(x, y, z) = 4x^2 - 10xy - 6y^2 + 14xz + 16yz - 9z^2$$

or

$$q(x_1, x_2, x_3) = 4x_1^2 - 10x_1x_2 - 6x_2^2 + 14x_1x_3 + 16x_2x_3 - 9x_3^2$$

Here we use the fact that the coefficients of the square terms $x_1^2, x_2^2, x_3^2$ (or $x^2, y^2, z^2$) are the respective
diagonal elements $4, -6, -9$ of $B$, and the coefficient of the cross-product term $x_ix_j$ is the sum of the
nondiagonal elements $b_{ij}$ and $b_{ji}$ (or twice $b_{ij}$, because $b_{ij} = b_{ji}$).

(c) Because $C$ is a four-square matrix, there are four unknowns. Hence,

$$q(x_1, x_2, x_3, x_4) = 2x_1^2 - 7x_2^2 + 3x_3^2 + x_4^2 + 8x_1x_2 - 2x_1x_3 + 10x_1x_4 - 12x_2x_3 + 16x_2x_4 + 18x_3x_4$$


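12.8. Let $A = \begin{bmatrix} 1 & -3 & 2 \\ -3 & 7 & -5 \\ 2 & -5 & 8 \end{bmatrix}$. Apply Algorithm 12.1 to find a nonsingular matrix $P$ such that
$D = P^T A P$ is diagonal, and find sig(A), the signature of $A$.

First form the block matrix $M = [A, I]$:

\[ M = [A, I] = \left[\begin{array}{rrr|rrr} 1 & -3 & 2 & 1 & 0 & 0 \\ -3 & 7 & -5 & 0 & 1 & 0 \\ 2 & -5 & 8 & 0 & 0 & 1 \end{array}\right] \]

Using $a_{11} = 1$ as a pivot, apply the row operations "Replace $R_2$ by $3R_1 + R_2$" and "Replace $R_3$ by
$-2R_1 + R_3$" to $M$ and then apply the corresponding column operations "Replace $C_2$ by $3C_1 + C_2$" and
"Replace $C_3$ by $-2C_1 + C_3$" to $A$ to obtain

\[ \left[\begin{array}{rrr|rrr} 1 & -3 & 2 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 1 & 4 & -2 & 0 & 1 \end{array}\right]
\quad \text{and then} \quad
\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 1 & 4 & -2 & 0 & 1 \end{array}\right] \]

Next apply the row operation "Replace $R_3$ by $R_2 + 2R_3$" and then the corresponding column operation
"Replace $C_3$ by $C_2 + 2C_3$" to obtain

\[ \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 0 & 9 & -1 & 1 & 2 \end{array}\right]
\quad \text{and then} \quad
\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 0 & 3 & 1 & 0 \\ 0 & 0 & 18 & -1 & 1 & 2 \end{array}\right] \]

Now $A$ has been diagonalized and the transpose of $P$ is in the right half of $M$. Thus, set

\[ P = \begin{bmatrix} 1 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix}
\quad \text{and then} \quad
D = P^T A P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 18 \end{bmatrix} \]

Note $D$ has $p = 2$ positive and $n = 1$ negative diagonal elements. Thus, the signature of $A$ is
sig(A) $= p - n = 2 - 1 = 1$.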
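The row-and-column procedure above can be sketched in Python. This is a simplified illustration, not Algorithm 12.1 verbatim: the names `congruence_diagonalize`, `signature`, and `matmul` are assumptions, the zero-pivot repair step is omitted (the code raises an error instead), and each row is cleared with the multiplier $-a_{ik}/a_{kk}$ rather than the fraction-free operation used in the text. The resulting $D$ therefore differs from the one above by positive scalings on the diagonal, which leaves the signature unchanged.

```python
from fractions import Fraction


def congruence_diagonalize(A):
    """Return (D, P) with D = P^T A P diagonal, for a symmetric matrix A.

    Sketch only: a zero pivot with nonzero entries below it is not handled.
    """
    n = len(A)
    # Block matrix M = [A, I], stored with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for k in range(n):
        pivot = M[k][k]
        for i in range(k + 1, n):
            if M[i][k] == 0:
                continue
            if pivot == 0:
                raise ValueError("zero pivot: case not handled in this sketch")
            c = -M[i][k] / pivot
            # Row operation on all of M: R_i <- c*R_k + R_i
            M[i] = [c * a + b for a, b in zip(M[k], M[i])]
            # Corresponding column operation on the left half: C_i <- c*C_k + C_i
            for r in range(n):
                M[r][i] += c * M[r][k]
    D = [row[:n] for row in M]
    Q = [row[n:] for row in M]                        # right half holds P^T
    P = [[Q[j][i] for j in range(n)] for i in range(n)]
    return D, P


def signature(D):
    """Number of positive minus number of negative diagonal entries."""
    diag = [D[i][i] for i in range(len(D))]
    return sum(d > 0 for d in diag) - sum(d < 0 for d in diag)


def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


A = [[1, -3, 2], [-3, 7, -5], [2, -5, 8]]
D, P = congruence_diagonalize(A)
PT = [list(col) for col in zip(*P)]
check = matmul(matmul(PT, A), P)      # recomputes P^T A P independently
```

For the matrix of this problem the sketch gives diagonal entries 1, -2, 9/2 and signature 1, in agreement with the signature found above.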

12.9. Justify Algorithm 12.1, which diagonalizes (under congruence) a symmetric matrix $A$.

Consider the block matrix $M = [A, I]$. The algorithm applies a sequence of elementary row operations
and the corresponding column operations to the left side of $M$, which is the matrix $A$. This is equivalent to
premultiplying $A$ by a sequence of elementary matrices, say, $E_1, E_2, \ldots, E_r$, and postmultiplying $A$ by the
transposes of the $E_i$. Thus, when the algorithm ends, the diagonal matrix $D$ on the left side of $M$ is equal to

\[ D = E_r \cdots E_2 E_1 A E_1^T E_2^T \cdots E_r^T = Q A Q^T, \quad \text{where } Q = E_r \cdots E_2 E_1 \]

On the other hand, the algorithm only applies the elementary row operations to the identity matrix $I$ on the
right side of $M$. Thus, when the algorithm ends, the matrix on the right side of $M$ is equal to

\[ E_r \cdots E_2 E_1 I = E_r \cdots E_2 E_1 = Q \]

Setting $P = Q^T$, we get $D = P^T A P$, which is a diagonalization of $A$ under congruence.

12.10. Prove Theorem 12.4: Let $f$ be a symmetric bilinear form on $V$ over $K$ (where $1 + 1 \ne 0$). Then $V$
has a basis in which $f$ is represented by a diagonal matrix.

Algorithm 12.1 shows that every symmetric matrix over $K$ is congruent to a diagonal matrix. This is
equivalent to the statement that $f$ has a diagonal representation.

12.11. Let $q$ be the quadratic form associated with the symmetric bilinear form $f$. Verify the polar
identity $f(u, v) = \tfrac{1}{2}[q(u + v) - q(u) - q(v)]$. (Assume that $1 + 1 \ne 0$.)

We have

\[ q(u + v) - q(u) - q(v) = f(u + v, u + v) - f(u, u) - f(v, v) \]
\[ = f(u, u) + f(u, v) + f(v, u) + f(v, v) - f(u, u) - f(v, v) = 2f(u, v) \]

If $1 + 1 \ne 0$, we can divide by 2 to obtain the required identity.

12.12. Consider the quadratic form $q(x, y) = 3x^2 + 2xy - y^2$ and the linear substitution

\[ x = s - 3t, \quad y = 2s + t \]

(a) Rewrite $q(x, y)$ in matrix notation, and find the matrix $A$ representing $q(x, y)$.

(b) Rewrite the linear substitution using matrix notation, and find the matrix $P$ corresponding to
the substitution.

(c) Find $q(s, t)$ using direct substitution.

(d) Find $q(s, t)$ using matrix notation.

(a) Here $q(x, y) = [x, y] \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$. Thus, $A = \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix}$; and $q(X) = X^T A X$, where $X = [x, y]^T$.

(b) Here $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} s \\ t \end{bmatrix}$. Thus, $P = \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix}$; and $X = \begin{bmatrix} x \\ y \end{bmatrix}$, $Y = \begin{bmatrix} s \\ t \end{bmatrix}$, and $X = PY$.

(c) Substitute for $x$ and $y$ in $q$ to obtain

\[ q(s, t) = 3(s - 3t)^2 + 2(s - 3t)(2s + t) - (2s + t)^2 \]
\[ = 3(s^2 - 6st + 9t^2) + 2(2s^2 - 5st - 3t^2) - (4s^2 + 4st + t^2) = 3s^2 - 32st + 20t^2 \]

(d) Here $q(X) = X^T A X$ and $X = PY$. Thus, $X^T = Y^T P^T$. Therefore,

\[ q(s, t) = q(Y) = Y^T P^T A P Y = [s, t] \begin{bmatrix} 1 & 2 \\ -3 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} s \\ t \end{bmatrix} \]
\[ = [s, t] \begin{bmatrix} 3 & -16 \\ -16 & 20 \end{bmatrix} \begin{bmatrix} s \\ t \end{bmatrix} = 3s^2 - 32st + 20t^2 \]

[As expected, the results in parts (c) and (d) are equal.] 
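The agreement between parts (c) and (d) can also be checked mechanically. The following minimal sketch computes $P^T A P$ with integer arithmetic and compares the two expressions for $q(s, t)$ at a few sample points; the helper `matmul` and the particular sample points are choices made for this illustration only.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


A = [[3, 1], [1, -1]]        # matrix of q(x, y) = 3x^2 + 2xy - y^2
P = [[1, -3], [2, 1]]        # substitution x = s - 3t, y = 2s + t
PT = [list(col) for col in zip(*P)]
B = matmul(matmul(PT, A), P)  # matrix of q in the variables s, t


def q(x, y):
    return 3 * x * x + 2 * x * y - y * y


def q_st(s, t):
    # Quadratic form read off from B: b11*s^2 + 2*b12*s*t + b22*t^2
    return B[0][0] * s * s + 2 * B[0][1] * s * t + B[1][1] * t * t


samples = [(1, 0), (0, 1), (2, 5), (-3, 4)]
agree = all(q(s - 3 * t, 2 * s + t) == q_st(s, t) for s, t in samples)
```

Here `B` should be the matrix with rows (3, -16) and (-16, 20), and `agree` should be true.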

12.13. Consider any diagonal matrix $A = \operatorname{diag}(a_1, \ldots, a_n)$ over $K$. Show that for any nonzero scalars
$k_1, \ldots, k_n \in K$, $A$ is congruent to a diagonal matrix $D$ with diagonal entries $a_1 k_1^2, \ldots, a_n k_n^2$.
Furthermore, show that

(a) If $K = \mathbf{C}$, then we can choose $D$ so that its diagonal entries are only 1's and 0's.

(b) If $K = \mathbf{R}$, then we can choose $D$ so that its diagonal entries are only 1's, $-1$'s, and 0's.

Let $P = \operatorname{diag}(k_1, \ldots, k_n)$. Then, as required,

\[ D = P^T A P = \operatorname{diag}(k_i) \operatorname{diag}(a_i) \operatorname{diag}(k_i) = \operatorname{diag}(a_1 k_1^2, \ldots, a_n k_n^2) \]

(a) Let $P = \operatorname{diag}(b_i)$, where $b_i = \begin{cases} 1/\sqrt{a_i} & \text{if } a_i \ne 0 \\ 1 & \text{if } a_i = 0 \end{cases}$

Then $P^T A P$ has the required form.

(b) Let $P = \operatorname{diag}(b_i)$, where $b_i = \begin{cases} 1/\sqrt{|a_i|} & \text{if } a_i \ne 0 \\ 1 & \text{if } a_i = 0 \end{cases}$

Then $P^T A P$ has the required form.

Remark: We emphasize that (b) is no longer true if "congruence" is replaced by "Hermitian
congruence."

12.14. Prove Theorem 12.5: Let $f$ be a symmetric bilinear form on $V$ over $\mathbf{R}$. Then there exists a basis of
$V$ in which $f$ is represented by a diagonal matrix. Every other diagonal matrix representation of $f$
has the same number $p$ of positive entries and the same number $n$ of negative entries.

By Theorem 12.4, there is a basis $\{u_1, \ldots, u_n\}$ of $V$ in which $f$ is represented by a diagonal matrix with,
say, $p$ positive and $n$ negative entries. Now suppose $\{w_1, \ldots, w_n\}$ is another basis of $V$, in which $f$ is
represented by a diagonal matrix with $p'$ positive and $n'$ negative entries. We can assume without loss of
generality that the positive entries in each matrix appear first. Because $\operatorname{rank}(f) = p + n = p' + n'$, it
suffices to prove that $p = p'$.

Let $U$ be the linear span of $u_1, \ldots, u_p$ and let $W$ be the linear span of $w_{p'+1}, \ldots, w_n$. Then $f(v, v) > 0$
for every nonzero $v \in U$, and $f(v, v) \le 0$ for every nonzero $v \in W$. Hence, $U \cap W = \{0\}$. Note that
$\dim U = p$ and $\dim W = n - p'$ (in this paragraph $n$ denotes $\dim V$, the number of basis vectors). Thus,

\[ \dim(U + W) = \dim U + \dim W - \dim(U \cap W) = p + (n - p') - 0 = p - p' + n \]

But $\dim(U + W) \le \dim V = n$; hence, $p - p' + n \le n$ or $p \le p'$. Similarly, $p' \le p$ and therefore $p = p'$, as
required.

Remark: The above theorem and proof depend only on the concept of positivity. Thus, the 
theorem is true for any subfield K of the real field R such as the rational field Q. 

Positive Definite Real Quadratic Forms 

12.15. Prove that the following definitions of a positive definite quadratic form $q$ are equivalent:

(a) The diagonal entries are all positive in any diagonal representation of $q$.

(b) $q(Y) > 0$, for any nonzero vector $Y$ in $\mathbf{R}^n$.

Suppose $q(Y) = a_1 y_1^2 + a_2 y_2^2 + \cdots + a_n y_n^2$. If all the coefficients are positive, then clearly $q(Y) > 0$
whenever $Y \ne 0$. Thus, (a) implies (b). Conversely, suppose (a) is not true; that is, suppose some diagonal
entry $a_k \le 0$. Let $e_k = (0, \ldots, 1, \ldots, 0)$ be the vector whose entries are all 0 except 1 in the $k$th position.
Then $q(e_k) = a_k$ is not positive, and so (b) is not true. That is, (b) implies (a). Accordingly, (a) and (b) are
equivalent.

12.16. Determine whether each of the following quadratic forms $q$ is positive definite:

(a) $q(x, y, z) = x^2 + 2y^2 - 4xz - 4yz + 7z^2$

(b) $q(x, y, z) = x^2 + y^2 + 2xz + 4yz + 3z^2$

Diagonalize (under congruence) the symmetric matrix $A$ corresponding to $q$.

(a) Apply the operations "Replace $R_3$ by $2R_1 + R_3$" and "Replace $C_3$ by $2C_1 + C_3$," and then "Replace $R_3$
by $R_2 + R_3$" and "Replace $C_3$ by $C_2 + C_3$." These yield

\[ A = \begin{bmatrix} 1 & 0 & -2 \\ 0 & 2 & -2 \\ -2 & -2 & 7 \end{bmatrix} \simeq
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -2 \\ 0 & -2 & 3 \end{bmatrix} \simeq
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

The diagonal representation of $q$ only contains positive entries, 1, 2, 1, on the diagonal. Thus, $q$ is
positive definite.

(b) We have

\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 3 \end{bmatrix} \simeq
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 2 & 2 \end{bmatrix} \simeq
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{bmatrix} \]

There is a negative entry $-2$ on the diagonal representation of $q$. Thus, $q$ is not positive definite.

12.17. Show that $q(x, y) = ax^2 + bxy + cy^2$ is positive definite if and only if $a > 0$ and the discriminant
$D = b^2 - 4ac < 0$.

Suppose $v = (x, y) \ne 0$. Then either $x \ne 0$ or $y \ne 0$; say, $y \ne 0$. Let $t = x/y$. Then

\[ q(v) = y^2[a(x/y)^2 + b(x/y) + c] = y^2(at^2 + bt + c) \]

However, the following are equivalent:

(i) $s = at^2 + bt + c$ is positive for every value of $t$.

(ii) $s = at^2 + bt + c$ lies above the $t$-axis.

(iii) $a > 0$ and $D = b^2 - 4ac < 0$.

Thus, $q$ is positive definite if and only if $a > 0$ and $D < 0$. [Remark: $D < 0$ is the same as $\det(A) > 0$, where
$A$ is the symmetric matrix corresponding to $q$.]

12.18. Determine whether or not each of the following quadratic forms $q$ is positive definite:

(a) $q(x, y) = x^2 - 4xy + 7y^2$, (b) $q(x, y) = x^2 + 8xy + 5y^2$, (c) $q(x, y) = 3x^2 + 2xy + y^2$

Compute the discriminant $D = b^2 - 4ac$, and then use Problem 12.17.

(a) $D = 16 - 28 = -12$. Because $a = 1 > 0$ and $D < 0$, $q$ is positive definite.

(b) $D = 64 - 20 = 44$. Because $D > 0$, $q$ is not positive definite.

(c) $D = 4 - 12 = -8$. Because $a = 3 > 0$ and $D < 0$, $q$ is positive definite.
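The discriminant test of Problem 12.17 translates directly into code. A minimal sketch follows; the function name `is_positive_definite` is an assumption, and the test applies only to forms in two variables.

```python
def is_positive_definite(a, b, c):
    """Test q(x, y) = a*x^2 + b*x*y + c*y^2 using the criterion of Problem 12.17."""
    discriminant = b * b - 4 * a * c
    return a > 0 and discriminant < 0


results = [is_positive_definite(1, -4, 7),   # part (a): D = -12
           is_positive_definite(1, 8, 5),    # part (b): D = 44
           is_positive_definite(3, 2, 1)]    # part (c): D = -8
```

The list `results` should reproduce the answers yes, no, yes found above.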

Hermitian Forms 

12.19. Determine whether the following matrices are Hermitian:

\[ \text{(a)}\ \begin{bmatrix} 2 & 2 + 3i & 4 - 5i \\ 2 - 3i & 5 & 6 + 2i \\ 4 + 5i & 6 - 2i & -7 \end{bmatrix}, \quad
\text{(b)}\ \begin{bmatrix} 3 & 2 - i & 4 + i \\ 2 - i & 6 & i \\ 4 + i & i & 7 \end{bmatrix}, \quad
\text{(c)}\ \begin{bmatrix} 4 & -3 & 5 \\ -3 & 2 & 1 \\ 5 & 1 & -6 \end{bmatrix} \]

A complex matrix $A = [a_{ij}]$ is Hermitian if $A^* = A$; that is, if $a_{ij} = \overline{a_{ji}}$.

(a) Yes, because it is equal to its conjugate transpose.

(b) No, even though it is symmetric.

(c) Yes. In fact, a real matrix is Hermitian if and only if it is symmetric.
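The entrywise condition $a_{ij} = \overline{a_{ji}}$ is easy to test with Python's built-in complex numbers. The sketch below is illustrative only; the name `is_hermitian` and the variable names for the three matrices are assumptions.

```python
def is_hermitian(A):
    """Return True if the square matrix A equals its conjugate transpose."""
    n = len(A)
    if any(len(row) != n for row in A):
        return False
    return all(A[i][j] == complex(A[j][i]).conjugate()
               for i in range(n) for j in range(n))


Ma = [[2, 2 + 3j, 4 - 5j], [2 - 3j, 5, 6 + 2j], [4 + 5j, 6 - 2j, -7]]
Mb = [[3, 2 - 1j, 4 + 1j], [2 - 1j, 6, 1j], [4 + 1j, 1j, 7]]
Mc = [[4, -3, 5], [-3, 2, 1], [5, 1, -6]]
answers = [is_hermitian(Ma), is_hermitian(Mb), is_hermitian(Mc)]
```

The matrix `Mb` fails because it is symmetric with nonreal off-diagonal entries, so $a_{ij} = a_{ji} \ne \overline{a_{ji}}$.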

12.20. Let $A$ be a Hermitian matrix. Show that $f$ is a Hermitian form on $\mathbf{C}^n$ where $f$ is defined by
$f(X, Y) = X^T A \bar{Y}$.

For all $a, b \in \mathbf{C}$ and all $X_1, X_2, Y \in \mathbf{C}^n$,

\[ f(aX_1 + bX_2, Y) = (aX_1 + bX_2)^T A \bar{Y} = (aX_1^T + bX_2^T) A \bar{Y} \]
\[ = aX_1^T A \bar{Y} + bX_2^T A \bar{Y} = af(X_1, Y) + bf(X_2, Y) \]

Hence, $f$ is linear in the first variable. Also,

\[ \overline{f(X, Y)} = \overline{X^T A \bar{Y}} = \overline{(X^T A \bar{Y})^T} = \overline{\bar{Y}^T A^T X} = Y^T A^* \bar{X} = Y^T A \bar{X} = f(Y, X) \]

Hence, $f$ is a Hermitian form on $\mathbf{C}^n$.

Remark: We use the fact that $X^T A \bar{Y}$ is a scalar and so it is equal to its transpose.

12.21. Let $f$ be a Hermitian form on $V$. Let $H$ be the matrix of $f$ in a basis $S = \{u_i\}$ of $V$. Prove the
following:

(a) $f(u, v) = [u]_S^T H \overline{[v]_S}$ for all $u, v \in V$.

(b) If $P$ is the change-of-basis matrix from $S$ to a new basis $S'$ of $V$, then $B = P^T H \bar{P}$ (or
$B = Q^* H Q$, where $Q = \bar{P}$) is the matrix of $f$ in the new basis $S'$.

Note that (b) is the complex analog of Theorem 12.2.

(a) Let $u, v \in V$ and suppose $u = a_1 u_1 + \cdots + a_n u_n$ and $v = b_1 u_1 + \cdots + b_n u_n$. Then, as required,

\[ f(u, v) = f(a_1 u_1 + \cdots + a_n u_n,\ b_1 u_1 + \cdots + b_n u_n) \]
\[ = \sum_{i,j} a_i \bar{b}_j f(u_i, u_j) = [a_1, \ldots, a_n] H [\bar{b}_1, \ldots, \bar{b}_n]^T = [u]_S^T H \overline{[v]_S} \]

(b) Because $P$ is the change-of-basis matrix from $S$ to $S'$, we have $P[u]_{S'} = [u]_S$ and $P[v]_{S'} = [v]_S$; hence,
$[u]_S^T = [u]_{S'}^T P^T$ and $\overline{[v]_S} = \bar{P}\,\overline{[v]_{S'}}$. Thus, by (a),

\[ f(u, v) = [u]_S^T H \overline{[v]_S} = [u]_{S'}^T P^T H \bar{P}\, \overline{[v]_{S'}} \]

But $u$ and $v$ are arbitrary elements of $V$; hence, $P^T H \bar{P}$ is the matrix of $f$ in the basis $S'$.


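12.22. Let $H = \begin{bmatrix} 1 & 1 + i & 2i \\ 1 - i & 4 & 2 - 3i \\ -2i & 2 + 3i & 7 \end{bmatrix}$, a Hermitian matrix.

Find a nonsingular matrix $P$ such that $D = P^T H \bar{P}$ is diagonal. Also, find the signature of $H$.

Use the modified Algorithm 12.1 that applies the same row operations but the corresponding conjugate
column operations. Thus, first form the block matrix $M = [H, I]$:

\[ M = \left[\begin{array}{ccc|ccc} 1 & 1 + i & 2i & 1 & 0 & 0 \\ 1 - i & 4 & 2 - 3i & 0 & 1 & 0 \\ -2i & 2 + 3i & 7 & 0 & 0 & 1 \end{array}\right] \]

Apply the row operations "Replace $R_2$ by $(-1 + i)R_1 + R_2$" and "Replace $R_3$ by $2iR_1 + R_3$" and then the
corresponding conjugate column operations "Replace $C_2$ by $(-1 - i)C_1 + C_2$" and "Replace $C_3$ by
$-2iC_1 + C_3$" to obtain

\[ \left[\begin{array}{ccc|ccc} 1 & 1 + i & 2i & 1 & 0 & 0 \\ 0 & 2 & -5i & -1 + i & 1 & 0 \\ 0 & 5i & 3 & 2i & 0 & 1 \end{array}\right]
\quad \text{and then} \quad
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & -5i & -1 + i & 1 & 0 \\ 0 & 5i & 3 & 2i & 0 & 1 \end{array}\right] \]

Next apply the row operation "Replace $R_3$ by $-5iR_2 + 2R_3$" and the corresponding conjugate column
operation "Replace $C_3$ by $5iC_2 + 2C_3$" to obtain

\[ \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & -5i & -1 + i & 1 & 0 \\ 0 & 0 & -19 & 5 + 9i & -5i & 2 \end{array}\right]
\quad \text{and then} \quad
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & 0 & -1 + i & 1 & 0 \\ 0 & 0 & -38 & 5 + 9i & -5i & 2 \end{array}\right] \]

Now $H$ has been diagonalized, and the transpose of the right half of $M$ is $P$. Thus, set

\[ P = \begin{bmatrix} 1 & -1 + i & 5 + 9i \\ 0 & 1 & -5i \\ 0 & 0 & 2 \end{bmatrix},
\quad \text{and then} \quad
D = P^T H \bar{P} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -38 \end{bmatrix} \]

Note $D$ has $p = 2$ positive elements and $n = 1$ negative elements. Thus, the signature of $H$ is
sig(H) $= 2 - 1 = 1$.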
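The result can be checked by multiplying out $P^T H \bar{P}$ directly. The sketch below uses Python's built-in complex type; because all entries are small Gaussian integers, the floating-point arithmetic here is exact. The helper `matmul` and the variable names are assumptions made for this illustration.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


H = [[1, 1 + 1j, 2j], [1 - 1j, 4, 2 - 3j], [-2j, 2 + 3j, 7]]
P = [[1, -1 + 1j, 5 + 9j], [0, 1, -5j], [0, 0, 2]]

PT = [list(col) for col in zip(*P)]                           # transpose of P
Pbar = [[complex(x).conjugate() for x in row] for row in P]   # entrywise conjugate
D = matmul(matmul(PT, H), Pbar)                               # P^T H conj(P)

diagonal = [D[i][i].real for i in range(3)]
off_diagonal_zero = all(D[i][j] == 0
                        for i in range(3) for j in range(3) if i != j)
signature = sum(d > 0 for d in diagonal) - sum(d < 0 for d in diagonal)
```

The diagonal should come out as 1, 2, -38 with all other entries zero, giving signature 1 as found above.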


Miscellaneous Problems 


12.23. Prove Theorem 12.3: Let $f$ be an alternating form on $V$. Then there exists a basis of $V$ in which $f$ is
represented by a block diagonal matrix $M$ with blocks of the form $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ or 0. The number
of nonzero blocks is uniquely determined by $f$ [because it is equal to $\tfrac{1}{2}\operatorname{rank}(f)$].

If $f = 0$, then the theorem is obviously true. Also, if $\dim V = 1$, then $f(k_1 u, k_2 u) = k_1 k_2 f(u, u) = 0$ and
so $f = 0$. Accordingly, we can assume that $\dim V > 1$ and $f \ne 0$.

Because $f \ne 0$, there exist (nonzero) $u_1, u_2 \in V$ such that $f(u_1, u_2) \ne 0$. In fact, multiplying $u_1$ by
an appropriate factor, we can assume that $f(u_1, u_2) = 1$ and so $f(u_2, u_1) = -1$. Now $u_1$ and $u_2$ are
linearly independent; because if, say, $u_2 = ku_1$, then $f(u_1, u_2) = f(u_1, ku_1) = kf(u_1, u_1) = 0$. Let
$U = \operatorname{span}(u_1, u_2)$; then,

(i) The matrix representation of the restriction of $f$ to $U$ in the basis $\{u_1, u_2\}$ is $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$,

(ii) If $u \in U$, say $u = au_1 + bu_2$, then

\[ f(u, u_1) = f(au_1 + bu_2, u_1) = -b \quad \text{and} \quad f(u, u_2) = f(au_1 + bu_2, u_2) = a \]

Let $W$ consist of those vectors $w \in V$ such that $f(w, u_1) = 0$ and $f(w, u_2) = 0$. Equivalently,

\[ W = \{w \in V : f(w, u) = 0 \text{ for every } u \in U\} \]

We claim that $V = U \oplus W$. It is clear that $U \cap W = \{0\}$, and so it remains to show that $V = U + W$. Let
$v \in V$. Set

\[ u = f(v, u_2)u_1 - f(v, u_1)u_2 \quad \text{and} \quad w = v - u \qquad (1) \]

Because $u$ is a linear combination of $u_1$ and $u_2$, $u \in U$.

We show next that $w \in W$. By (1) and (ii), $f(u, u_1) = f(v, u_1)$; hence,

\[ f(w, u_1) = f(v - u, u_1) = f(v, u_1) - f(u, u_1) = 0 \]

Similarly, $f(u, u_2) = f(v, u_2)$ and so

\[ f(w, u_2) = f(v - u, u_2) = f(v, u_2) - f(u, u_2) = 0 \]

Then $w \in W$ and so, by (1), $v = u + w$, where $u \in U$. This shows that $V = U + W$; therefore, $V = U \oplus W$.

Now the restriction of $f$ to $W$ is an alternating bilinear form on $W$. By induction, there exists a basis
$u_3, \ldots, u_n$ of $W$ in which the matrix representing $f$ restricted to $W$ has the desired form. Accordingly,
$u_1, u_2, u_3, \ldots, u_n$ is a basis of $V$ in which the matrix representing $f$ has the desired form.


SUPPLEMENTARY PROBLEMS 


Bilinear Forms 

12.24. Let $u = (x_1, x_2)$ and $v = (y_1, y_2)$. Determine which of the following are bilinear forms on $\mathbf{R}^2$:

(a) $f(u, v) = 2x_1 y_2 - 3x_2 y_1$, (c) $f(u, v) = 3x_2 y_2$, (e) $f(u, v) = 1$,

(b) $f(u, v) = x_1 + y_2$, (d) $f(u, v) = x_1 x_2 + y_1 y_2$, (f) $f(u, v) = 0$

12.25. Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by

\[ f[(x_1, x_2), (y_1, y_2)] = 3x_1 y_1 - 2x_1 y_2 + 4x_2 y_1 - x_2 y_2 \]

(a) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 1),\ u_2 = (1, 2)\}$.

(b) Find the matrix $B$ of $f$ in the basis $\{v_1 = (1, -1),\ v_2 = (3, 1)\}$.

(c) Find the change-of-basis matrix $P$ from $\{u_i\}$ to $\{v_i\}$, and verify that $B = P^T A P$.


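12.26. Let $V$ be the vector space of two-square matrices over $\mathbf{R}$. Let $M = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}$ and let $f(A, B) = \operatorname{tr}(A^T M B)$,
where $A, B \in V$ and "tr" denotes trace. (a) Show that $f$ is a bilinear form on $V$. (b) Find the matrix of $f$ in the
basis

\[ \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\} \]

12.27. Let $B(V)$ be the set of bilinear forms on $V$ over $K$. Prove the following:

(a) If $f, g \in B(V)$, then $f + g$, $kg \in B(V)$ for any $k \in K$.

(b) If $\phi$ and $\sigma$ are linear functions on $V$, then $f(u, v) = \phi(u)\sigma(v)$ belongs to $B(V)$.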


12.28. Let $[f]$ denote the matrix representation of a bilinear form $f$ on $V$ relative to a basis $\{u_i\}$. Show that the
mapping $f \mapsto [f]$ is an isomorphism of $B(V)$ onto the vector space of $n$-square matrices.

12.29. Let $f$ be a bilinear form on $V$. For any subset $S$ of $V$, let

\[ S^\perp = \{v \in V : f(u, v) = 0 \text{ for every } u \in S\} \quad \text{and} \quad S^\top = \{v \in V : f(v, u) = 0 \text{ for every } u \in S\} \]

Show that: (a) $S^\perp$ and $S^\top$ are subspaces of $V$; (b) $S_1 \subseteq S_2$ implies $S_2^\perp \subseteq S_1^\perp$ and $S_2^\top \subseteq S_1^\top$;
(c) $\{0\}^\perp = \{0\}^\top = V$.

12.30. Suppose $f$ is a bilinear form on $V$. Prove that: $\operatorname{rank}(f) = \dim V - \dim V^\perp = \dim V - \dim V^\top$, and hence,
$\dim V^\perp = \dim V^\top$.

12.31. Let $f$ be a bilinear form on $V$. For each $u \in V$, let $\hat{u}: V \to K$ and $\tilde{u}: V \to K$ be defined by $\hat{u}(x) = f(x, u)$ and
$\tilde{u}(x) = f(u, x)$. Prove the following:

(a) $\hat{u}$ and $\tilde{u}$ are each linear; i.e., $\hat{u}, \tilde{u} \in V^*$,

(b) $u \mapsto \hat{u}$ and $u \mapsto \tilde{u}$ are each linear mappings from $V$ into $V^*$,

(c) $\operatorname{rank}(f) = \operatorname{rank}(u \mapsto \hat{u}) = \operatorname{rank}(u \mapsto \tilde{u})$.

12.32. Show that congruence of matrices (denoted by $\simeq$) is an equivalence relation; that is,

(i) $A \simeq A$; (ii) If $A \simeq B$, then $B \simeq A$; (iii) If $A \simeq B$ and $B \simeq C$, then $A \simeq C$.

Symmetric Bilinear Forms, Quadratic Forms 

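12.33. Find the symmetric matrix $A$ belonging to each of the following quadratic forms:

(a) $q(x, y, z) = 2x^2 - 8xy + y^2 - 16xz + 14yz + 5z^2$, (c) $q(x, y, z) = xy + y^2 + 4xz + z^2$

(b) $q(x, y, z) = x^2 - xz + y^2$, (d) $q(x, y, z) = xy + yz$

12.34. For each of the following symmetric matrices $A$, find a nonsingular matrix $P$ such that $D = P^T A P$ is
diagonal:

\[ \text{(a)}\ A = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 3 & 6 \\ 2 & 6 & 7 \end{bmatrix}, \quad
\text{(b)}\ A = \begin{bmatrix} 1 & -2 & 1 \\ -2 & 5 & 3 \\ 1 & 3 & -2 \end{bmatrix}, \quad
\text{(c)}\ A = \begin{bmatrix} 1 & -1 & 0 & 2 \\ -1 & 2 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 2 & 0 & 2 & -1 \end{bmatrix} \]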


12.35. Let $q(x, y) = 2x^2 - 6xy - 3y^2$ and $x = s + 2t$, $y = 3s - t$.

(a) Rewrite $q(x, y)$ in matrix notation, and find the matrix $A$ representing the quadratic form.

(b) Rewrite the linear substitution using matrix notation, and find the matrix $P$ corresponding to the
substitution.

(c) Find $q(s, t)$ using (i) direct substitution, (ii) matrix notation.

12.36. For each of the following quadratic forms $q(x, y, z)$, find a nonsingular linear substitution expressing the
variables $x, y, z$ in terms of variables $r, s, t$ such that $q(r, s, t)$ is diagonal:

(a) $q(x, y, z) = x^2 + 6xy + 8y^2 - 4xz + 2yz - 9z^2$,

(b) $q(x, y, z) = 2x^2 - 3y^2 + 8xz + 12yz + 25z^2$,

(c) $q(x, y, z) = x^2 + 2xy + 3y^2 + 4xz + 8yz + 6z^2$.

In each case, find the rank and signature.

12.37. Give an example of a quadratic form $q(x, y)$ such that $q(u) = 0$ and $q(v) = 0$ but $q(u + v) \ne 0$.

12.38. Let $S(V)$ denote all symmetric bilinear forms on $V$. Show that

(a) $S(V)$ is a subspace of $B(V)$; (b) If $\dim V = n$, then $\dim S(V) = \tfrac{1}{2}n(n + 1)$.

12.39. Consider a real quadratic polynomial $q(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} a_{ij} x_i x_j$, where $a_{ij} = a_{ji}$.

(a) If $a_{11} \ne 0$, show that the substitution

\[ x_1 = y_1 - \frac{1}{a_{11}}(a_{12} y_2 + \cdots + a_{1n} y_n), \quad x_2 = y_2, \quad \ldots, \quad x_n = y_n \]

yields the equation $q(x_1, \ldots, x_n) = a_{11} y_1^2 + q'(y_2, \ldots, y_n)$, where $q'$ is also a quadratic polynomial.

(b) If $a_{11} = 0$ but, say, $a_{12} \ne 0$, show that the substitution

\[ x_1 = y_1 + y_2, \quad x_2 = y_1 - y_2, \quad x_3 = y_3, \quad \ldots, \quad x_n = y_n \]

yields the equation $q(x_1, \ldots, x_n) = \sum b_{ij} y_i y_j$, where $b_{11} \ne 0$, which reduces this case to case (a).

Remark: This method of diagonalizing $q$ is known as completing the square.


Positive Definite Quadratic Forms 

12.40. Determine whether or not each of the following quadratic forms is positive definite: 

(a) $q(x, y) = 4x^2 + 5xy + 7y^2$, (c) $q(x, y, z) = x^2 + 4xy + 5y^2 + 6xz + 2yz + 4z^2$

(b) $q(x, y) = 2x^2 - 3xy - y^2$, (d) $q(x, y, z) = x^2 + 2xy + 2y^2 + 4xz + 6yz + 7z^2$

12.41. Find those values of $k$ such that the given quadratic form is positive definite:

(a) $q(x, y) = 2x^2 - 5xy + ky^2$, (b) $q(x, y) = 3x^2 - kxy + 12y^2$

(c) $q(x, y, z) = x^2 + 2xy + 2y^2 + 2xz + 6yz + kz^2$

12.42. Suppose A is a real symmetric positive definite matrix. Show that A = P T P for some nonsingular matrix P. 


Hermitian Forms 


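12.43. Modify Algorithm 12.1 so that, for a given Hermitian matrix $H$, it finds a nonsingular matrix $P$ for which
$D = P^T H \bar{P}$ is diagonal.

12.44. For each Hermitian matrix $H$, find a nonsingular matrix $P$ such that $D = P^T H \bar{P}$ is diagonal:

\[ \text{(a)}\ H = \begin{bmatrix} 1 & i \\ -i & 2 \end{bmatrix}, \quad
\text{(b)}\ H = \begin{bmatrix} 1 & 2 + 3i \\ 2 - 3i & -1 \end{bmatrix}, \quad
\text{(c)}\ H = \begin{bmatrix} 1 & i & 2 + i \\ -i & 2 & 1 - i \\ 2 - i & 1 + i & 2 \end{bmatrix} \]

Find the rank and signature in each case.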


12.45. Let A be a complex nonsingular matrix. Show that H = A*A is Hermitian and positive definite. 

12.46. We say that $B$ is Hermitian congruent to $A$ if there exists a nonsingular matrix $P$ such that $B = P^T A \bar{P}$ or,
equivalently, if there exists a nonsingular matrix $Q$ such that $B = Q^* A Q$. Show that Hermitian congruence is
an equivalence relation. (Note: If $P = \bar{Q}$, then $P^T A \bar{P} = Q^* A Q$.)


12.47. Prove Theorem 12.7: Let $f$ be a Hermitian form on $V$. Then there is a basis $S$ of $V$ in which $f$ is represented
by a diagonal matrix, and every such diagonal representation has the same number $p$ of positive entries and
the same number $n$ of negative entries.


Miscellaneous Problems 

12.48. Let $e$ denote an elementary row operation, and let $f^*$ denote the corresponding conjugate column operation
(where each scalar $k$ in $e$ is replaced by $\bar{k}$ in $f^*$). Show that the elementary matrix corresponding to $f^*$ is the
conjugate transpose of the elementary matrix corresponding to $e$.

12.49. Let $V$ and $W$ be vector spaces over $K$. A mapping $f: V \times W \to K$ is called a bilinear form on $V$ and $W$ if

(i) $f(av_1 + bv_2, w) = af(v_1, w) + bf(v_2, w)$,

(ii) $f(v, aw_1 + bw_2) = af(v, w_1) + bf(v, w_2)$

for every $a, b \in K$, $v_i \in V$, $w_j \in W$. Prove the following:

(a) The set $B(V, W)$ of bilinear forms on $V$ and $W$ is a subspace of the vector space of functions from
$V \times W$ into $K$.

(b) If $\{\phi_1, \ldots, \phi_m\}$ is a basis of $V^*$ and $\{\sigma_1, \ldots, \sigma_n\}$ is a basis of $W^*$, then
$\{f_{ij} : i = 1, \ldots, m,\ j = 1, \ldots, n\}$ is a basis of $B(V, W)$, where $f_{ij}$ is defined by $f_{ij}(v, w) = \phi_i(v)\sigma_j(w)$.
Thus, $\dim B(V, W) = \dim V \dim W$.

[Note that if $V = W$, then we obtain the space $B(V)$ investigated in this chapter.]

12.50. Let $V$ be a vector space over $K$. A mapping $f: \underbrace{V \times V \times \cdots \times V}_{m\ \text{times}} \to K$ is called a multilinear (or $m$-linear)
form on $V$ if $f$ is linear in each variable; that is, for $i = 1, \ldots, m$,

\[ f(\ldots, \widehat{au + bv}, \ldots) = af(\ldots, \hat{u}, \ldots) + bf(\ldots, \hat{v}, \ldots) \]

where $\widehat{\ \cdots\ }$ denotes the $i$th element, and other elements are held fixed. An $m$-linear form $f$ is said to be
alternating if $f(v_1, \ldots, v_m) = 0$ whenever $v_i = v_j$ for $i \ne j$. Prove the following:

(a) The set $B_m(V)$ of $m$-linear forms on $V$ is a subspace of the vector space of functions from
$V \times V \times \cdots \times V$ into $K$.

(b) The set $A_m(V)$ of alternating $m$-linear forms on $V$ is a subspace of $B_m(V)$.

Remark 1: If $m = 2$, then we obtain the space $B(V)$ investigated in this chapter.

Remark 2: If $V = K^m$, then the determinant function is an alternating $m$-linear form on $V$.


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$.

12.24. (a) yes, (b) no, (c) yes, (d) no, (e) no, (f) yes

12.25. (a) $A = [4, 1;\ 7, 3]$, (b) $B = [0, -4;\ 20, 32]$, (c) $P = [3, 5;\ -2, -2]$

12.26. (b) $[1, 0, 2, 0;\ 0, 1, 0, 2;\ 3, 0, 5, 0;\ 0, 3, 0, 5]$

12.33. (a) $[2, -4, -8;\ -4, 1, 7;\ -8, 7, 5]$, (b) $[1, 0, -\tfrac{1}{2};\ 0, 1, 0;\ -\tfrac{1}{2}, 0, 0]$,

(c) $[0, \tfrac{1}{2}, 2;\ \tfrac{1}{2}, 1, 0;\ 2, 0, 1]$, (d) $[0, \tfrac{1}{2}, 0;\ \tfrac{1}{2}, 0, \tfrac{1}{2};\ 0, \tfrac{1}{2}, 0]$

12.34. (a) $P = [1, 0, -2;\ 0, 1, -2;\ 0, 0, 1]$, $D = \operatorname{diag}(1, 3, -9)$;

(b) $P = [1, 2, -11;\ 0, 1, -5;\ 0, 0, 1]$, $D = \operatorname{diag}(1, 1, -28)$;

(c) $P = [1, 1, -1, -4;\ 0, 1, -1, -2;\ 0, 0, 1, 0;\ 0, 0, 0, 1]$, $D = \operatorname{diag}(1, 1, 0, -9)$

12.35. $A = [2, -3;\ -3, -3]$, $P = [1, 2;\ 3, -1]$, $q(s, t) = -43s^2 - 4st + 17t^2$

12.36. (a) $x = r - 3s - 19t$, $y = s + 7t$, $z = t$; $q(r, s, t) = r^2 - s^2 + 36t^2$;

(b) $x = r - 2t$, $y = s + 2t$, $z = t$; $q(r, s, t) = 2r^2 - 3s^2 + 29t^2$;

(c) $x = r - s - t$, $y = s - t$, $z = t$; $q(r, s, t) = r^2 + 2s^2$

12.37. $q(x, y) = x^2 - y^2$, $u = (1, 1)$, $v = (1, -1)$

12.40. (a) yes, (b) no, (c) no, (d) yes

12.41. (a) $k > \tfrac{25}{8}$, (b) $-12 < k < 12$, (c) $k > 5$

12.44. (a) $P = [1, i;\ 0, 1]$, $D = I$, $s = 2$; (b) $P = [1, -2 + 3i;\ 0, 1]$, $D = \operatorname{diag}(1, -14)$, $s = 0$;

(c) $P = [1, i, -3 + i;\ 0, 1, i;\ 0, 0, 1]$, $D = \operatorname{diag}(1, 1, -4)$, $s = 1$





CHAPTER 13 



Linear Operators on Inner Product Spaces


13.1 Introduction 


This chapter investigates the space $A(V)$ of linear operators $T$ on an inner product space $V$. (See Chapter 7.)
Thus, the base field $K$ is either the real numbers $\mathbf{R}$ or the complex numbers $\mathbf{C}$. In fact, different
terminologies will be used for the real case and the complex case. We also use the fact that the inner
products on real Euclidean space $\mathbf{R}^n$ and complex Euclidean space $\mathbf{C}^n$ may be defined, respectively, by

\[ \langle u, v \rangle = u^T v \quad \text{and} \quad \langle u, v \rangle = u^T \bar{v} \]

where $u$ and $v$ are column vectors.

The reader should review the material in Chapter 7 and be very familiar with the notions of norm
(length), orthogonality, and orthonormal bases. We also note that Chapter 7 mainly dealt with real inner
product spaces, whereas here we assume that $V$ is a complex inner product space unless otherwise stated or
implied.

Lastly, we note that in Chapter 2, we used $A^H$ to denote the conjugate transpose of a complex matrix $A$;
that is, $A^H = \bar{A}^T$. This notation is not standard. Many texts, especially advanced texts, use $A^*$ to denote
such a matrix; we will use that notation in this chapter. That is, now $A^* = \bar{A}^T$.


13.2 Adjoint Operators 

We begin with the following basic definition. 

DEFINITION: A linear operator $T$ on an inner product space $V$ is said to have an adjoint operator $T^*$
on $V$ if $\langle T(u), v \rangle = \langle u, T^*(v) \rangle$ for every $u, v \in V$.

The following example shows that the adjoint operator has a simple description within the context of
matrix mappings.

EXAMPLE 13.1

(a) Let $A$ be a real $n$-square matrix viewed as a linear operator on $\mathbf{R}^n$. Then, for every $u, v \in \mathbf{R}^n$,

\[ \langle Au, v \rangle = (Au)^T v = u^T A^T v = \langle u, A^T v \rangle \]

Thus, the transpose $A^T$ of $A$ is the adjoint of $A$.

(b) Let $B$ be a complex $n$-square matrix viewed as a linear operator on $\mathbf{C}^n$. Then for every $u, v \in \mathbf{C}^n$,

\[ \langle Bu, v \rangle = (Bu)^T \bar{v} = u^T B^T \bar{v} = u^T \overline{B^* v} = \langle u, B^* v \rangle \]

Thus, the conjugate transpose $B^*$ of $B$ is the adjoint of $B$.
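The identity in part (b) can be tested numerically. The following sketch uses the inner product $\langle u, v \rangle = u^T \bar{v}$ from Section 13.1; the sample matrix `B`, the sample vectors `u` and `v`, and the helper names are arbitrary choices made for illustration. With small Gaussian-integer entries the complex arithmetic is exact, so the two sides can be compared with `==`.

```python
def inner(u, v):
    """Inner product on C^n: <u, v> = u^T conj(v)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def conjugate_transpose(M):
    return [[complex(M[j][i]).conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]


B = [[1 + 2j, 3], [-1j, 4 - 1j]]     # arbitrary sample matrix
u = [2 - 1j, 1 + 1j]                  # arbitrary sample vectors
v = [3j, -2 + 5j]

lhs = inner(apply(B, u), v)                        # <Bu, v>
rhs = inner(u, apply(conjugate_transpose(B), v))   # <u, B*v>
```

The two values `lhs` and `rhs` should coincide, as the computation in part (b) predicts.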




Remark: B* may mean either the adjoint of B as a linear operator or the conjugate transpose of B as 
a matrix. By Example 13.1(b), the ambiguity makes no difference, because they denote the same object. 

The following theorem (proved in Problem 13.4) is the main result in this section. 

THEOREM 13.1: Let $T$ be a linear operator on a finite-dimensional inner product space $V$ over $K$.
Then

(i) There exists a unique linear operator $T^*$ on $V$ such that $\langle T(u), v \rangle = \langle u, T^*(v) \rangle$
for every $u, v \in V$. (That is, $T$ has an adjoint $T^*$.)

(ii) If $A$ is the matrix representation of $T$ with respect to any orthonormal basis
$S = \{u_i\}$ of $V$, then the matrix representation of $T^*$ in the basis $S$ is the
conjugate transpose $A^*$ of $A$ (or the transpose $A^T$ of $A$ when $K$ is real).

We emphasize that no such simple relationship exists between the matrices representing $T$ and $T^*$ if the
basis is not orthonormal. Thus, we see one useful property of orthonormal bases. We also emphasize that
this theorem is not valid if $V$ has infinite dimension (Problem 13.31).

The following theorem (proved in Problem 13.5) summarizes some of the properties of the adjoint.

THEOREM 13.2: Let $T, T_1, T_2$ be linear operators on $V$ and let $k \in K$. Then

(i) $(T_1 + T_2)^* = T_1^* + T_2^*$, (iii) $(T_1 T_2)^* = T_2^* T_1^*$,

(ii) $(kT)^* = \bar{k} T^*$, (iv) $(T^*)^* = T$.

Observe the similarity between the above theorem and Theorem 2.3 on properties of the transpose 
operation on matrices. 

Linear Functionals and Inner Product Spaces 

Recall (Chapter 11) that a linear functional $\phi$ on a vector space $V$ is a linear mapping $\phi: V \to K$. This
subsection contains an important result (Theorem 13.3) that is used in the proof of the above basic
Theorem 13.1.

Let $V$ be an inner product space. Each $u \in V$ determines a mapping $\hat{u}: V \to K$ defined by

\[ \hat{u}(v) = \langle v, u \rangle \]

Now, for any $a, b \in K$ and any $v_1, v_2 \in V$,

\[ \hat{u}(av_1 + bv_2) = \langle av_1 + bv_2, u \rangle = a\langle v_1, u \rangle + b\langle v_2, u \rangle = a\hat{u}(v_1) + b\hat{u}(v_2) \]

That is, $\hat{u}$ is a linear functional on $V$. The converse is also true for spaces of finite dimension and it is
contained in the following important theorem (proved in Problem 13.3).

THEOREM 13.3: Let $\phi$ be a linear functional on a finite-dimensional inner product space $V$. Then there
exists a unique vector $u \in V$ such that $\phi(v) = \langle v, u \rangle$ for every $v \in V$.

We remark that the above theorem is not valid for spaces of infinite dimension (Problem 13.24).


13.3 Analogy Between $A(V)$ and $\mathbf{C}$, Special Linear Operators


Let $A(V)$ denote the algebra of all linear operators on a finite-dimensional inner product space $V$. The
adjoint mapping $T \mapsto T^*$ on $A(V)$ is quite analogous to the conjugation mapping $z \mapsto \bar{z}$ on the complex
field $\mathbf{C}$. To illustrate this analogy we identify in Table 13-1 certain classes of operators $T \in A(V)$ whose
behavior under the adjoint map imitates the behavior under conjugation of familiar classes of complex
numbers.

Table 13-1

| Class of complex numbers | Behavior under conjugation | Class of operators in $A(V)$ | Behavior under the adjoint map |
|---|---|---|---|
| Unit circle ($\lvert z \rvert = 1$) | $\bar{z} = 1/z$ | Orthogonal operators (real case); unitary operators (complex case) | $T^* = T^{-1}$ |
| Real axis | $\bar{z} = z$ | Self-adjoint operators; also called symmetric (real case), Hermitian (complex case) | $T^* = T$ |
| Imaginary axis | $\bar{z} = -z$ | Skew-adjoint operators; also called skew-symmetric (real case), skew-Hermitian (complex case) | $T^* = -T$ |
| Positive real axis $(0, \infty)$ | $z = \bar{w}w$, $w \ne 0$ | Positive definite operators | $T = S^*S$ with $S$ nonsingular |

The analogy between these operators $T$ and complex numbers $z$ is reflected in the next theorem.

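THEOREM 13.4: Let $\lambda$ be an eigenvalue of a linear operator $T$ on $V$.

(i) If $T^* = T^{-1}$ (i.e., $T$ is orthogonal or unitary), then $|\lambda| = 1$.

(ii) If $T^* = T$ (i.e., $T$ is self-adjoint), then $\lambda$ is real.

(iii) If $T^* = -T$ (i.e., $T$ is skew-adjoint), then $\lambda$ is pure imaginary.

(iv) If $T = S^*S$ with $S$ nonsingular (i.e., $T$ is positive definite), then $\lambda$ is real and
positive.

Proof. In each case let $v$ be a nonzero eigenvector of $T$ belonging to $\lambda$; that is, $T(v) = \lambda v$ with
$v \ne 0$. Hence, $\langle v, v \rangle$ is positive.

Proof of (i). We show that $\lambda\bar{\lambda}\langle v, v \rangle = \langle v, v \rangle$:

\[ \lambda\bar{\lambda}\langle v, v \rangle = \langle \lambda v, \lambda v \rangle = \langle T(v), T(v) \rangle = \langle v, T^*T(v) \rangle = \langle v, I(v) \rangle = \langle v, v \rangle \]

But $\langle v, v \rangle \ne 0$; hence, $\lambda\bar{\lambda} = 1$ and so $|\lambda| = 1$.

Proof of (ii). We show that $\lambda\langle v, v \rangle = \bar{\lambda}\langle v, v \rangle$:

\[ \lambda\langle v, v \rangle = \langle \lambda v, v \rangle = \langle T(v), v \rangle = \langle v, T^*(v) \rangle = \langle v, T(v) \rangle = \langle v, \lambda v \rangle = \bar{\lambda}\langle v, v \rangle \]

But $\langle v, v \rangle \ne 0$; hence, $\lambda = \bar{\lambda}$ and so $\lambda$ is real.

Proof of (iii). We show that $\lambda\langle v, v \rangle = -\bar{\lambda}\langle v, v \rangle$:

\[ \lambda\langle v, v \rangle = \langle \lambda v, v \rangle = \langle T(v), v \rangle = \langle v, T^*(v) \rangle = \langle v, -T(v) \rangle = \langle v, -\lambda v \rangle = -\bar{\lambda}\langle v, v \rangle \]

But $\langle v, v \rangle \ne 0$; hence, $\lambda = -\bar{\lambda}$ or $\bar{\lambda} = -\lambda$, and so $\lambda$ is pure imaginary.

Proof of (iv). Note first that $S(v) \ne 0$ because $S$ is nonsingular; hence, $\langle S(v), S(v) \rangle$ is positive. We
show that $\lambda\langle v, v \rangle = \langle S(v), S(v) \rangle$:

\[ \lambda\langle v, v \rangle = \langle \lambda v, v \rangle = \langle T(v), v \rangle = \langle S^*S(v), v \rangle = \langle S(v), S(v) \rangle \]

But $\langle v, v \rangle$ and $\langle S(v), S(v) \rangle$ are positive; hence, $\lambda$ is positive.

Remark: Each of the above operators $T$ commutes with its adjoint; that is, $TT^* = T^*T$. Such
operators are called normal operators.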


13.4 Self-Adjoint Operators 


Let T be a self-adjoint operator on an inner product space V: that is, suppose 
J 1 * — T 

(If T is defined by a matrix A, then A is symmetric or Hermitian according as A is real or complex.) By 
Theorem 13.4, the eigenvalues of T are real. The following is another important property of T. 

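THEOREM 13.5: Let T be a self-adjoint operator on V. Suppose u and v are eigenvectors of T
belonging to distinct eigenvalues. Then u and v are orthogonal; that is, $\langle u, v\rangle = 0$.

Proof. Suppose $T(u) = \lambda_1 u$ and $T(v) = \lambda_2 v$, where $\lambda_1 \neq \lambda_2$. We show that $\lambda_1\langle u, v\rangle = \lambda_2\langle u, v\rangle$:

\[ \lambda_1\langle u, v\rangle = \langle \lambda_1 u, v\rangle = \langle T(u), v\rangle = \langle u, T^*(v)\rangle = \langle u, T(v)\rangle = \langle u, \lambda_2 v\rangle = \bar\lambda_2\langle u, v\rangle = \lambda_2\langle u, v\rangle \]

(The fourth equality uses the fact that $T^* = T$, and the last equality uses the fact that the eigenvalue $\lambda_2$ is
real.) Because $\lambda_1 \neq \lambda_2$, we get $\langle u, v\rangle = 0$. Thus, the theorem is proved.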
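Theorem 13.5 can be checked on a small case. The sketch below uses an assumed real symmetric matrix (chosen only for illustration, not taken from the text) whose eigenvalues 1 and 3 are distinct, and confirms that the corresponding eigenvectors have zero inner product.

```python
# Assumed example: a real symmetric (hence self-adjoint) 2x2 matrix.
A = [[2, 1],
     [1, 2]]

def mat_vec(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

u, lam1 = [1, -1], 1   # eigenvector for eigenvalue 1
v, lam2 = [1, 1], 3    # eigenvector for eigenvalue 3

# Confirm that u and v really are eigenvectors.
assert mat_vec(A, u) == [lam1 * x for x in u]
assert mat_vec(A, v) == [lam2 * x for x in v]

print(dot(u, v))  # 0: the eigenvectors are orthogonal
```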


13.5 Orthogonal and Unitary Operators 

Let U be a linear operator on a finite-dimensional inner product space V. Suppose 

\[ U^* = U^{-1} \quad\text{or equivalently}\quad UU^* = U^*U = I \]

Recall that U is said to be orthogonal or unitary according as the underlying field is real or complex. The 
next theorem (proved in Problem 13.10) gives alternative characterizations of these operators. 

THEOREM 13.6: The following conditions on an operator U are equivalent: 

(i) $U^* = U^{-1}$; that is, $UU^* = U^*U = I$. [U is unitary (orthogonal).]

(ii) U preserves inner products; that is, for every $v, w \in V$, $\langle U(v), U(w)\rangle = \langle v, w\rangle$.

(iii) U preserves lengths; that is, for every $v \in V$, $\|U(v)\| = \|v\|$.

EXAMPLE 13.2 

(a) Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the linear operator that rotates each vector v about the z-axis by a fixed angle $\theta$ as shown in
Fig. 10-1 (Section 10.3). That is, T is defined by

\[ T(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z) \]

We note that lengths (distances from the origin) are preserved under T. Thus, T is an orthogonal operator.

(b) Let V be $\ell_2$-space (Hilbert space), defined in Section 7.3. Let $T: V \to V$ be the linear operator defined by

\[ T(a_1, a_2, a_3, \ldots) = (0, a_1, a_2, a_3, \ldots) \]

Clearly, T preserves inner products and lengths. However, T is not surjective, because, for example, (1,0,0,...) 
does not belong to the image of T ; hence, T is not invertible. Thus, we see that Theorem 13.6 is not valid for 
spaces of infinite dimension. 
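The claim in part (a) that the rotation preserves lengths can be checked numerically. The following is a minimal sketch; the sample vector and angle are arbitrary assumed values, and the function names are illustrative.

```python
import math

def rotate_z(v, theta):
    """Rotate v = (x, y, z) about the z-axis by the angle theta."""
    x, y, z = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)

def norm(v):
    """Euclidean length of v."""
    return math.sqrt(sum(c * c for c in v))

v = (1.0, 2.0, 3.0)   # assumed sample vector
theta = 0.7           # assumed sample angle in radians
w = rotate_z(v, theta)

# The lengths agree up to floating-point rounding.
print(math.isclose(norm(v), norm(w)))
```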

An isomorphism from one inner product space into another is a bijective mapping that preserves the
three basic operations of an inner product space: vector addition, scalar multiplication, and inner products.
Thus, the above mappings (orthogonal and unitary) may also be characterized as the isomorphisms of V
into itself. Note that such a mapping U also preserves distances, because

\[ \|U(v) - U(w)\| = \|U(v - w)\| = \|v - w\| \]

Hence, U is called an isometry. 

13.6 Orthogonal and Unitary Matrices 


Let U be a linear operator on an inner product space V. By Theorem 13.1, we obtain the following results. 

THEOREM 13.7A: A complex matrix A represents a unitary operator U (relative to an orthonormal
basis) if and only if $A^* = A^{-1}$.

THEOREM 13.7B: A real matrix A represents an orthogonal operator U (relative to an orthonormal
basis) if and only if $A^T = A^{-1}$.

The above theorems motivate the following definitions (which appeared in Sections 2.10 and 2.11).

DEFINITION: A complex matrix A for which $A^* = A^{-1}$ is called a unitary matrix.

DEFINITION: A real matrix A for which $A^T = A^{-1}$ is called an orthogonal matrix.

We repeat Theorem 2.6, which characterizes the above matrices. 

THEOREM 13.8: The following conditions on a matrix A are equivalent: 

(i) A is unitary (orthogonal). 

(ii) The rows of A form an orthonormal set. 

(iii) The columns of A form an orthonormal set. 
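The equivalences in Theorem 13.8 can be illustrated numerically. In the sketch below the matrix A is an assumed example (not taken from the text); the entries of $AA^*$ are the inner products of the rows of A, and the entries of $A^*A$ are the inner products of its columns, so both products equal I exactly when the rows and the columns are orthonormal.

```python
import math

def conj_transpose(m):
    """Conjugate transpose m* of a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[m[i][j].conjugate() for i in range(rows)] for j in range(cols)]

def mat_mul(a, b):
    """Matrix product a*b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_identity(m, tol=1e-12):
    """True if m equals the identity matrix up to the tolerance tol."""
    n = len(m)
    return all(abs(m[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
A = [[s, s * 1j],
     [s * 1j, s]]   # assumed example of a unitary matrix

print(is_identity(mat_mul(A, conj_transpose(A))))  # rows are orthonormal
print(is_identity(mat_mul(conj_transpose(A), A)))  # columns are orthonormal
```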

13.7 Change of Orthonormal Basis 


Orthonormal bases play a special role in the theory of inner product spaces V. Thus, we are naturally 
interested in the properties of the change-of-basis matrix from one such basis to another. The following 
theorem (proved in Problem 13.12) holds. 

THEOREM 13.9: Let $\{u_1, \ldots, u_n\}$ be an orthonormal basis of an inner product space V. Then the
change-of-basis matrix from $\{u_i\}$ into another orthonormal basis is unitary (orthogonal).
Conversely, if $P = [a_{ij}]$ is a unitary (orthogonal) matrix, then the following is
an orthonormal basis:

\[ \{u_i' = a_{1i}u_1 + a_{2i}u_2 + \cdots + a_{ni}u_n : i = 1, \ldots, n\} \]

Recall that matrices A and B representing the same linear operator T are similar; that is, $B = P^{-1}AP$,
where P is the (nonsingular) change-of-basis matrix. On the other hand, if V is an inner product space, we
are usually interested in the case when P is unitary (or orthogonal) as suggested by Theorem 13.9. (Recall
that P is unitary if the conjugate transpose $P^* = P^{-1}$, and P is orthogonal if the transpose $P^T = P^{-1}$.) This
leads to the following definition.

DEFINITION: Complex matrices A and B are unitarily equivalent if there exists a unitary matrix P for
which $B = P^*AP$. Analogously, real matrices A and B are orthogonally equivalent if
there exists an orthogonal matrix P for which $B = P^TAP$.

Note that orthogonally equivalent matrices are necessarily congruent.

13.8 Positive Definite and Positive Operators

Let P be a linear operator on an inner product space V. Then 

(i) P is said to be positive definite if P = S*S for some nonsingular operator S.

(ii) P is said to be positive (or nonnegative or semidefinite) if P = S*S for some operator S. 

The following theorems give alternative characterizations of these operators. 

THEOREM 13.10A: The following conditions on an operator P are equivalent: 

(i) $P = T^2$ for some nonsingular self-adjoint operator T.

(ii) P is positive definite.

(iii) P is self-adjoint and $\langle P(u), u\rangle > 0$ for every $u \neq 0$ in V.

The corresponding theorem for positive operators (proved in Problem 13.21) follows. 

THEOREM 13.10B: The following conditions on an operator P are equivalent: 

(i) $P = T^2$ for some self-adjoint operator T.

(ii) P is positive; that is, $P = S^*S$.

(iii) P is self-adjoint and $\langle P(u), u\rangle \geq 0$ for every $u \in V$.

13.9 Diagonalization and Canonical Forms in Inner Product Spaces 


Let T be a linear operator on a finite-dimensional inner product space V over K. Representing T by a
diagonal matrix depends upon the eigenvectors and eigenvalues of T, and hence, upon the roots of the
characteristic polynomial $\Delta(t)$ of T. Now $\Delta(t)$ always factors into linear polynomials over the
complex field $\mathbf{C}$ but may not have any linear factors over the real field $\mathbf{R}$. Thus, the situation for
real inner product spaces (sometimes called Euclidean spaces) is inherently different from the
situation for complex inner product spaces (sometimes called unitary spaces), and we treat them
separately.

Real Inner Product Spaces, Symmetric and Orthogonal Operators 

The following theorem (proved in Problem 13.14) holds. 

THEOREM 13.11: Let T be a symmetric (self-adjoint) operator on a real finite-dimensional inner product
space V. Then there exists an orthonormal basis of V consisting of eigenvectors of
T; that is, T can be represented by a diagonal matrix relative to an orthonormal
basis.

We give the corresponding statement for matrices. 

THEOREM 13.11: (Alternative Form) Let A be a real symmetric matrix. Then there exists an
orthogonal matrix P such that $B = P^{-1}AP = P^TAP$ is diagonal.

We can choose the columns of the above matrix P to be normalized orthogonal eigenvectors of A; then 
the diagonal entries of B are the corresponding eigenvalues. 

On the other hand, an orthogonal operator T need not be symmetric, and so it may not be represented by
a diagonal matrix relative to an orthonormal basis. However, such an operator T does have a simple
canonical representation, as described in the following theorem (proved in Problem 13.16).

THEOREM 13.12: Let T be an orthogonal operator on a real inner product space V. Then there exists
an orthonormal basis of V in which T is represented by a block diagonal matrix M
of the form

\[ M = \operatorname{diag}\left( I_s,\ -I_t,\ \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix},\ \ldots,\ \begin{bmatrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{bmatrix} \right) \]

The reader may recognize that each of the 2x2 diagonal blocks represents a rotation in the 
corresponding two-dimensional subspace, and each diagonal entry —1 represents a reflection in the 
corresponding one-dimensional subspace. 


Complex Inner Product Spaces, Normal and Triangular Operators 


A linear operator T is said to be normal if it commutes with its adjoint; that is, if $TT^* = T^*T$. We note
that normal operators include both self-adjoint and unitary operators.

Analogously, a complex matrix A is said to be normal if it commutes with its conjugate transpose—that 
is, if AA* = A*A. 


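EXAMPLE 13.3 Let $A = \begin{bmatrix} 1 & 1 \\ i & 3+2i \end{bmatrix}$. Then $A^* = \begin{bmatrix} 1 & -i \\ 1 & 3-2i \end{bmatrix}$.

Also

\[ AA^* = \begin{bmatrix} 2 & 3-3i \\ 3+3i & 14 \end{bmatrix} = A^*A \]

Thus, A is normal.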
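The computation in Example 13.3 can be reproduced with Python's built-in complex numbers. This is only a checking sketch; the helper names are illustrative.

```python
def conj_transpose(m):
    """Conjugate transpose m* of a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[m[i][j].conjugate() for i in range(rows)] for j in range(cols)]

def mat_mul(a, b):
    """Matrix product a*b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# The matrix A of Example 13.3.
A = [[1, 1],
     [1j, 3 + 2j]]
A_star = conj_transpose(A)

left = mat_mul(A, A_star)    # AA*
right = mat_mul(A_star, A)   # A*A

print(left)
print(left == right)  # True: A commutes with A*, so A is normal
```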


The following theorem (proved in Problem 13.19) holds. 


THEOREM 13.13: Let T be a normal operator on a complex finite-dimensional inner product space V. 

Then there exists an orthonormal basis of V consisting of eigenvectors of T ; that is, 
T can be represented by a diagonal matrix relative to an orthonormal basis. 

We give the corresponding statement for matrices. 


THEOREM 13.13: (Alternative Form) Let A be a normal matrix. Then there exists a unitary matrix P
such that $B = P^{-1}AP = P^*AP$ is diagonal.

The following theorem (proved in Problem 13.20) shows that even nonnormal operators on unitary 
spaces have a relatively simple form. 


THEOREM 13.14: Let T be an arbitrary operator on a complex finite-dimensional inner product space 
V. Then T can be represented by a triangular matrix relative to an orthonormal basis 
of V. 

THEOREM 13.14: (Alternative Form) Let A be an arbitrary complex matrix. Then there exists a
unitary matrix P such that $B = P^{-1}AP = P^*AP$ is triangular.


13.10 Spectral Theorem 


The Spectral Theorem is a reformulation of the diagonalization Theorems 13.11 and 13.13.

THEOREM 13.15: (Spectral Theorem) Let T be a normal (symmetric) operator on a complex (real)
finite-dimensional inner product space V. Then there exist linear operators
$E_1, \ldots, E_r$ on V and scalars $\lambda_1, \ldots, \lambda_r$ such that

(i) $T = \lambda_1E_1 + \lambda_2E_2 + \cdots + \lambda_rE_r$,
(ii) $E_1 + E_2 + \cdots + E_r = I$,
(iii) $E_1^2 = E_1,\ E_2^2 = E_2,\ \ldots,\ E_r^2 = E_r$,
(iv) $E_iE_j = 0$ for $i \neq j$.

The above linear operators $E_1, \ldots, E_r$ are projections in the sense that $E_i^2 = E_i$. Moreover, they are said
to be orthogonal projections because they have the additional property that $E_iE_j = 0$ for $i \neq j$.

The following example shows the relationship between a diagonal matrix representation and the 
corresponding orthogonal projections. 

EXAMPLE 13.4 Consider the following diagonal matrices $A, E_1, E_2, E_3$:



The reader can verify that 

(i) $A = 2E_1 + 3E_2 + 5E_3$, (ii) $E_1 + E_2 + E_3 = I$, (iii) $E_i^2 = E_i$, (iv) $E_iE_j = 0$ for $i \neq j$.
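The four conditions can be verified mechanically. Because the matrices of the example are not reproduced in this text, the sketch below uses assumed matrices: a hypothetical diagonal matrix A with diagonal entries 2, 3, 3, 5 and the diagonal 0/1 projections that pick out each eigenvalue. Only the eigenvalues 2, 3, 5 come from condition (i); everything else is an illustrative choice.

```python
def diag(entries):
    """Square diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    """Matrix product a*b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_add(a, b):
    """Entrywise sum a + b."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Scalar multiple c*a."""
    return [[c * x for x in row] for row in a]

# Assumed matrices (hypothetical, chosen to satisfy the pattern of the example).
A = diag([2, 3, 3, 5])
E = [diag([1, 0, 0, 0]), diag([0, 1, 1, 0]), diag([0, 0, 0, 1])]
lams = [2, 3, 5]
I = diag([1, 1, 1, 1])
Z = diag([0, 0, 0, 0])

total, ident = Z, Z
for lam, Ei in zip(lams, E):
    total = mat_add(total, scale(lam, Ei))   # builds 2*E1 + 3*E2 + 5*E3
    ident = mat_add(ident, Ei)               # builds E1 + E2 + E3

print(total == A)                                      # (i)
print(ident == I)                                      # (ii)
print(all(mat_mul(Ei, Ei) == Ei for Ei in E))          # (iii)
print(all(mat_mul(E[i], E[j]) == Z
          for i in range(3) for j in range(3) if i != j))  # (iv)
```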


SOLVED PROBLEMS 


Adjoints 

13.1. Find the adjoint of $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by

\[ F(x, y, z) = (3x + 4y - 5z,\ 2x - 6y + 7z,\ 5x - 9y + z) \]

First find the matrix A that represents F in the usual basis of $\mathbf{R}^3$ (that is, the matrix A whose rows are
the coefficients of x, y, z), and then form the transpose $A^T$ of A. This yields

\[ A = \begin{bmatrix} 3 & 4 & -5 \\ 2 & -6 & 7 \\ 5 & -9 & 1 \end{bmatrix} \quad\text{and then}\quad A^T = \begin{bmatrix} 3 & 2 & 5 \\ 4 & -6 & -9 \\ -5 & 7 & 1 \end{bmatrix} \]

The adjoint $F^*$ is represented by the transpose of A; hence,

\[ F^*(x, y, z) = (3x + 2y + 5z,\ 4x - 6y - 9z,\ -5x + 7y + z) \]
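The defining property of the adjoint, $\langle F(u), v\rangle = \langle u, F^*(v)\rangle$, can be spot-checked for the matrices above. The test vectors u and v below are arbitrary assumed values.

```python
# Matrix of F from Problem 13.1 and its transpose, which represents F*.
A = [[3, 4, -5],
     [2, -6, 7],
     [5, -9, 1]]
A_T = [list(col) for col in zip(*A)]

def apply(m, v):
    """Apply the operator with matrix m to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def dot(u, v):
    """Usual inner product on R^3."""
    return sum(a * b for a, b in zip(u, v))

u = [1, 2, 3]     # assumed test vector
v = [4, -1, 2]    # assumed test vector

lhs = dot(apply(A, u), v)     # <F(u), v>
rhs = dot(u, apply(A_T, v))   # <u, F*(v)>
print(lhs, rhs)  # -47 -47
```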

13.2. Find the adjoint of $G: \mathbf{C}^3 \to \mathbf{C}^3$ defined by

\[ G(x, y, z) = [2x + (1 - i)y,\ (3 + 2i)x - 4iz,\ 2ix + (4 - 3i)y - 3z] \]

First find the matrix B that represents G in the usual basis of $\mathbf{C}^3$, and then form the conjugate transpose
$B^*$ of B. This yields

\[ B = \begin{bmatrix} 2 & 1-i & 0 \\ 3+2i & 0 & -4i \\ 2i & 4-3i & -3 \end{bmatrix} \quad\text{and then}\quad B^* = \begin{bmatrix} 2 & 3-2i & -2i \\ 1+i & 0 & 4+3i \\ 0 & 4i & -3 \end{bmatrix} \]

Then $G^*(x, y, z) = [2x + (3 - 2i)y - 2iz,\ (1 + i)x + (4 + 3i)z,\ 4iy - 3z]$.


13.3. Prove Theorem 13.3: Let $\phi$ be a linear functional on an n-dimensional inner product space V.
Then there exists a unique vector $u \in V$ such that $\phi(v) = \langle v, u\rangle$ for every $v \in V$.

Let $\{w_1, \ldots, w_n\}$ be an orthonormal basis of V. Set

\[ u = \overline{\phi(w_1)}\,w_1 + \overline{\phi(w_2)}\,w_2 + \cdots + \overline{\phi(w_n)}\,w_n \]

Let $\hat u$ be the linear functional on V defined by $\hat u(v) = \langle v, u\rangle$ for every $v \in V$. Then, for $i = 1, \ldots, n$,

\[ \hat u(w_i) = \langle w_i, u\rangle = \langle w_i,\ \overline{\phi(w_1)}\,w_1 + \cdots + \overline{\phi(w_n)}\,w_n\rangle = \phi(w_i) \]

Because $\hat u$ and $\phi$ agree on each basis vector, $\hat u = \phi$.

Now suppose $u'$ is another vector in V for which $\phi(v) = \langle v, u'\rangle$ for every $v \in V$. Then $\langle v, u\rangle = \langle v, u'\rangle$
or $\langle v, u - u'\rangle = 0$. In particular, this is true for $v = u - u'$, and so $\langle u - u', u - u'\rangle = 0$. This yields
$u - u' = 0$ and $u = u'$. Thus, such a vector u is unique, as claimed.

13.4. Prove Theorem 13.1: Let T be a linear operator on an n-dimensional inner product space V. Then

(a) There exists a unique linear operator $T^*$ on V such that

\[ \langle T(u), v\rangle = \langle u, T^*(v)\rangle \quad\text{for all } u, v \in V. \]

(b) Let A be the matrix that represents T relative to an orthonormal basis $S = \{u_i\}$. Then the
conjugate transpose $A^*$ of A represents $T^*$ in the basis S.

(a) We first define the mapping $T^*$. Let v be an arbitrary but fixed element of V. The map $u \mapsto \langle T(u), v\rangle$ is
a linear functional on V. Hence, by Theorem 13.3, there exists a unique element $v' \in V$ such
that $\langle T(u), v\rangle = \langle u, v'\rangle$ for every $u \in V$. We define $T^*: V \to V$ by $T^*(v) = v'$. Then
$\langle T(u), v\rangle = \langle u, T^*(v)\rangle$ for every $u, v \in V$.

We next show that $T^*$ is linear. For any $u, v_i \in V$, and any $a, b \in K$,

\[ \langle u, T^*(av_1 + bv_2)\rangle = \langle T(u), av_1 + bv_2\rangle = \bar a\langle T(u), v_1\rangle + \bar b\langle T(u), v_2\rangle = \bar a\langle u, T^*(v_1)\rangle + \bar b\langle u, T^*(v_2)\rangle = \langle u, aT^*(v_1) + bT^*(v_2)\rangle \]

But this is true for every $u \in V$; hence, $T^*(av_1 + bv_2) = aT^*(v_1) + bT^*(v_2)$. Thus, $T^*$ is linear.

(b) The matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ that represent T and $T^*$, respectively, relative to the orthonormal
basis S are given by $a_{ij} = \langle T(u_j), u_i\rangle$ and $b_{ij} = \langle T^*(u_j), u_i\rangle$ (Problem 13.67). Hence,

\[ b_{ij} = \langle T^*(u_j), u_i\rangle = \overline{\langle u_i, T^*(u_j)\rangle} = \overline{\langle T(u_i), u_j\rangle} = \overline{a_{ji}} \]

Thus, $B = A^*$, as claimed.

13.5. Prove Theorem 13.2:

(i) $(T_1 + T_2)^* = T_1^* + T_2^*$, (iii) $(T_1T_2)^* = T_2^*T_1^*$,

(ii) $(kT)^* = \bar k T^*$, (iv) $(T^*)^* = T$.

(i) For any $u, v \in V$,

\[ \langle (T_1 + T_2)(u), v\rangle = \langle T_1(u) + T_2(u), v\rangle = \langle T_1(u), v\rangle + \langle T_2(u), v\rangle = \langle u, T_1^*(v)\rangle + \langle u, T_2^*(v)\rangle = \langle u, T_1^*(v) + T_2^*(v)\rangle = \langle u, (T_1^* + T_2^*)(v)\rangle \]

The uniqueness of the adjoint implies $(T_1 + T_2)^* = T_1^* + T_2^*$.

(ii) For any $u, v \in V$,

\[ \langle (kT)(u), v\rangle = \langle kT(u), v\rangle = k\langle T(u), v\rangle = k\langle u, T^*(v)\rangle = \langle u, \bar k T^*(v)\rangle = \langle u, (\bar k T^*)(v)\rangle \]

The uniqueness of the adjoint implies $(kT)^* = \bar k T^*$.

(iii) For any $u, v \in V$,

\[ \langle (T_1T_2)(u), v\rangle = \langle T_1(T_2(u)), v\rangle = \langle T_2(u), T_1^*(v)\rangle = \langle u, T_2^*(T_1^*(v))\rangle = \langle u, (T_2^*T_1^*)(v)\rangle \]

The uniqueness of the adjoint implies $(T_1T_2)^* = T_2^*T_1^*$.

(iv) For any $u, v \in V$,

\[ \langle T^*(u), v\rangle = \overline{\langle v, T^*(u)\rangle} = \overline{\langle T(v), u\rangle} = \langle u, T(v)\rangle \]

The uniqueness of the adjoint implies $(T^*)^* = T$.

13.6. Show that (a) $I^* = I$, and (b) $0^* = 0$.

(a) For every $u, v \in V$, $\langle I(u), v\rangle = \langle u, v\rangle = \langle u, I(v)\rangle$; hence, $I^* = I$.

(b) For every $u, v \in V$, $\langle 0(u), v\rangle = \langle 0, v\rangle = 0 = \langle u, 0\rangle = \langle u, 0(v)\rangle$; hence, $0^* = 0$.

13.7. Suppose T is invertible. Show that $(T^{-1})^* = (T^*)^{-1}$.

$I = I^* = (TT^{-1})^* = (T^{-1})^*T^*$; hence, $(T^{-1})^* = (T^*)^{-1}$.

13.8. Let T be a linear operator on V, and let W be a T-invariant subspace of V. Show that $W^\perp$ is
invariant under $T^*$.

Let $u \in W^\perp$. If $w \in W$, then $T(w) \in W$ and so $\langle w, T^*(u)\rangle = \langle T(w), u\rangle = 0$. Thus, $T^*(u) \in W^\perp$ because
it is orthogonal to every $w \in W$. Hence, $W^\perp$ is invariant under $T^*$.

13.9. Let T be a linear operator on V. Show that each of the following conditions implies $T = 0$:

(i) $\langle T(u), v\rangle = 0$ for every $u, v \in V$.

(ii) V is a complex space, and $\langle T(u), u\rangle = 0$ for every $u \in V$.

(iii) T is self-adjoint and $\langle T(u), u\rangle = 0$ for every $u \in V$.

Give an example of an operator T on a real space V for which $\langle T(u), u\rangle = 0$ for every $u \in V$ but $T \neq 0$.
[Thus, (ii) need not hold for a real space V.]

(i) Set $v = T(u)$. Then $\langle T(u), T(u)\rangle = 0$, and hence, $T(u) = 0$, for every $u \in V$. Accordingly, $T = 0$.

(ii) By hypothesis, $\langle T(v + w), v + w\rangle = 0$ for any $v, w \in V$. Expanding and setting $\langle T(v), v\rangle = 0$ and
$\langle T(w), w\rangle = 0$, we find

\[ \langle T(v), w\rangle + \langle T(w), v\rangle = 0 \tag{1} \]

Note w is arbitrary in (1). Substituting iw for w, and using $\langle T(v), iw\rangle = \bar i\langle T(v), w\rangle = -i\langle T(v), w\rangle$ and
$\langle T(iw), v\rangle = \langle iT(w), v\rangle = i\langle T(w), v\rangle$, we find

\[ -i\langle T(v), w\rangle + i\langle T(w), v\rangle = 0 \]

Dividing through by i and adding to (1), we obtain $\langle T(w), v\rangle = 0$ for any $v, w \in V$. By (i), $T = 0$.

(iii) By (ii), the result holds for the complex case; hence we need only consider the real case. Expanding
$\langle T(v + w), v + w\rangle = 0$, we again obtain (1). Because T is self-adjoint and V is a real space, we
have $\langle T(w), v\rangle = \langle w, T(v)\rangle = \langle T(v), w\rangle$. Substituting this into (1), we obtain $\langle T(v), w\rangle = 0$ for any
$v, w \in V$. By (i), $T = 0$.

For an example, consider the linear operator T on $\mathbf{R}^2$ defined by $T(x, y) = (y, -x)$. Then
$\langle T(u), u\rangle = 0$ for every $u \in V$, but $T \neq 0$.
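The counterexample is easy to check directly: for $u = (x, y)$ we have $\langle T(u), u\rangle = yx - xy = 0$. A short sketch with a few assumed sample vectors:

```python
def T(u):
    """The operator T(x, y) = (y, -x) on R^2."""
    x, y = u
    return (y, -x)

def dot(u, v):
    """Usual inner product on R^2."""
    return sum(a * b for a, b in zip(u, v))

samples = [(1, 0), (0, 1), (2, -3), (5, 7)]   # assumed sample vectors

print([dot(T(u), u) for u in samples])  # [0, 0, 0, 0]
print(T((1, 0)))                        # (0, -1), so T is not the zero operator
```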

Orthogonal and Unitary Operators and Matrices 

13.10. Prove Theorem 13.6: The following conditions on an operator U are equivalent:

(i) $U^* = U^{-1}$; that is, U is unitary. (ii) $\langle U(v), U(w)\rangle = \langle v, w\rangle$. (iii) $\|U(v)\| = \|v\|$.

Suppose (i) holds. Then, for every $v, w \in V$,

\[ \langle U(v), U(w)\rangle = \langle v, U^*U(w)\rangle = \langle v, I(w)\rangle = \langle v, w\rangle \]

Thus, (i) implies (ii). Now if (ii) holds, then

\[ \|U(v)\| = \sqrt{\langle U(v), U(v)\rangle} = \sqrt{\langle v, v\rangle} = \|v\| \]

Hence, (ii) implies (iii). It remains to show that (iii) implies (i).

Suppose (iii) holds. Then for every $v \in V$,

\[ \langle U^*U(v), v\rangle = \langle U(v), U(v)\rangle = \langle v, v\rangle = \langle I(v), v\rangle \]

Hence, $\langle (U^*U - I)(v), v\rangle = 0$ for every $v \in V$. But $U^*U - I$ is self-adjoint (Prove!); then, by Problem
13.9, we have $U^*U - I = 0$ and so $U^*U = I$. Thus, $U^* = U^{-1}$, as claimed.

13.11. Let U be a unitary (orthogonal) operator on V, and let W be a subspace invariant under U. Show
that $W^\perp$ is also invariant under U.

Because U is nonsingular, $U(W) = W$; that is, for any $w \in W$, there exists $w' \in W$ such that
$U(w') = w$. Now let $v \in W^\perp$. Then, for any $w \in W$,

\[ \langle U(v), w\rangle = \langle U(v), U(w')\rangle = \langle v, w'\rangle = 0 \]

Thus, $U(v)$ belongs to $W^\perp$. Therefore, $W^\perp$ is invariant under U.

13.12. Prove Theorem 13.9: The change-of-basis matrix from an orthonormal basis $\{u_1, \ldots, u_n\}$ into
another orthonormal basis is unitary (orthogonal). Conversely, if $P = [a_{ij}]$ is a unitary (orthogonal)
matrix, then the vectors $u_i' = \sum_j a_{ji}u_j$ form an orthonormal basis.

Suppose $\{v_i\}$ is another orthonormal basis and suppose

\[ v_i = b_{i1}u_1 + b_{i2}u_2 + \cdots + b_{in}u_n, \quad i = 1, \ldots, n \tag{1} \]

Because $\{v_i\}$ is orthonormal,

\[ \delta_{ij} = \langle v_i, v_j\rangle = b_{i1}\overline{b_{j1}} + b_{i2}\overline{b_{j2}} + \cdots + b_{in}\overline{b_{jn}} \tag{2} \]

Let $B = [b_{ij}]$ be the matrix of coefficients in (1). (Then $B^T$ is the change-of-basis matrix from $\{u_i\}$ to
$\{v_i\}$.) Then $BB^* = [c_{ij}]$, where $c_{ij} = b_{i1}\overline{b_{j1}} + b_{i2}\overline{b_{j2}} + \cdots + b_{in}\overline{b_{jn}}$. By (2), $c_{ij} = \delta_{ij}$, and therefore $BB^* = I$.
Accordingly, B, and hence, $B^T$, is unitary.

It remains to prove that $\{u_i'\}$ is orthonormal. By Problem 13.67,

\[ \langle u_i', u_j'\rangle = a_{1i}\overline{a_{1j}} + a_{2i}\overline{a_{2j}} + \cdots + a_{ni}\overline{a_{nj}} = \langle C_i, C_j\rangle \]

where $C_i$ denotes the ith column of the unitary (orthogonal) matrix $P = [a_{ij}]$. Because P is unitary (orthogonal),
its columns are orthonormal; hence, $\langle u_i', u_j'\rangle = \langle C_i, C_j\rangle = \delta_{ij}$. Thus, $\{u_i'\}$ is an orthonormal basis.

Symmetric Operators and Canonical Forms in Euclidean Spaces 

13.13. Let T be a symmetric operator. Show that (a) The characteristic polynomial $\Delta(t)$ of T is a product
of linear polynomials (over $\mathbf{R}$); (b) T has a nonzero eigenvector.

(a) Let A be a matrix representing T relative to an orthonormal basis of V; then $A = A^T$. Let $\Delta(t)$ be the
characteristic polynomial of A. Viewing A as a complex self-adjoint operator, A has only real
eigenvalues by Theorem 13.4. Thus,

\[ \Delta(t) = (t - \lambda_1)(t - \lambda_2)\cdots(t - \lambda_n) \]

where the $\lambda_i$ are all real. In other words, $\Delta(t)$ is a product of linear polynomials over $\mathbf{R}$.

(b) By (a), T has at least one (real) eigenvalue. Hence, T has a nonzero eigenvector.

13.14. Prove Theorem 13.11: Let T be a symmetric operator on a real n-dimensional inner product space
V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Hence, T can be
represented by a diagonal matrix relative to an orthonormal basis.)

The proof is by induction on the dimension of V. If $\dim V = 1$, the theorem trivially holds. Now
suppose $\dim V = n > 1$. By Problem 13.13, there exists a nonzero eigenvector $v_1$ of T. Let W be the space
spanned by $v_1$, and let $u_1$ be a unit vector in W, e.g., let $u_1 = v_1/\|v_1\|$.

Because $v_1$ is an eigenvector of T, the subspace W of V is invariant under T. By Problem 13.8, $W^\perp$ is
invariant under $T^* = T$. Thus, the restriction $\hat T$ of T to $W^\perp$ is a symmetric operator. By Theorem 7.4,
$V = W \oplus W^\perp$. Hence, $\dim W^\perp = n - 1$, because $\dim W = 1$. By induction, there exists an orthonormal
basis $\{u_2, \ldots, u_n\}$ of $W^\perp$ consisting of eigenvectors of $\hat T$ and hence of T. But $\langle u_1, u_i\rangle = 0$ for $i = 2, \ldots, n$
because $u_i \in W^\perp$. Accordingly $\{u_1, u_2, \ldots, u_n\}$ is an orthonormal set and consists of eigenvectors of T.
Thus, the theorem is proved.

13.15. Let $q(x, y) = 3x^2 - 6xy + 11y^2$. Find an orthonormal change of coordinates (linear substitution)
that diagonalizes the quadratic form q.

Find the symmetric matrix A representing q and its characteristic polynomial $\Delta(t)$. We have

\[ A = \begin{bmatrix} 3 & -3 \\ -3 & 11 \end{bmatrix} \quad\text{and}\quad \Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 - 14t + 24 = (t - 2)(t - 12) \]

The eigenvalues are $\lambda = 2$ and $\lambda = 12$. Hence, a diagonal form of q is

\[ q(s, t) = 2s^2 + 12t^2 \]

(where we use s and t as new variables). The corresponding orthogonal change of coordinates is obtained by
finding an orthogonal set of eigenvectors of A.

Subtract $\lambda = 2$ down the diagonal of A to obtain the matrix

\[ M = \begin{bmatrix} 1 & -3 \\ -3 & 9 \end{bmatrix} \quad\text{corresponding to}\quad \begin{aligned} x - 3y &= 0 \\ -3x + 9y &= 0 \end{aligned} \quad\text{or}\quad x - 3y = 0 \]

A nonzero solution is $u_1 = (3, 1)$. Next subtract $\lambda = 12$ down the diagonal of A to obtain the matrix

\[ M = \begin{bmatrix} -9 & -3 \\ -3 & -1 \end{bmatrix} \quad\text{corresponding to}\quad \begin{aligned} -9x - 3y &= 0 \\ -3x - y &= 0 \end{aligned} \quad\text{or}\quad -3x - y = 0 \]

A nonzero solution is $u_2 = (-1, 3)$. Normalize $u_1$ and $u_2$ to obtain the orthonormal basis

\[ \hat u_1 = (3/\sqrt{10},\ 1/\sqrt{10}), \quad \hat u_2 = (-1/\sqrt{10},\ 3/\sqrt{10}) \]

Now let P be the matrix whose columns are $\hat u_1$ and $\hat u_2$. Then

\[ P = \begin{bmatrix} 3/\sqrt{10} & -1/\sqrt{10} \\ 1/\sqrt{10} & 3/\sqrt{10} \end{bmatrix} \quad\text{and}\quad D = P^{-1}AP = P^TAP = \begin{bmatrix} 2 & 0 \\ 0 & 12 \end{bmatrix} \]

Thus, the required orthogonal change of coordinates is

\[ \begin{bmatrix} x \\ y \end{bmatrix} = P\begin{bmatrix} s \\ t \end{bmatrix} \quad\text{or}\quad x = \frac{3s - t}{\sqrt{10}}, \quad y = \frac{s + 3t}{\sqrt{10}} \]

One can also express s and t in terms of x and y by using $P^{-1} = P^T$; that is,

\[ s = \frac{3x + y}{\sqrt{10}}, \quad t = \frac{-x + 3y}{\sqrt{10}} \]
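The diagonalization in Problem 13.15 can be confirmed numerically. The sketch below forms $P^TAP$ in floating point, so the off-diagonal entries come out as values near zero rather than exact zeros; the helper names are illustrative.

```python
import math

def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def mat_mul(a, b):
    """Matrix product a*b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Matrix of the quadratic form q(x, y) = 3x^2 - 6xy + 11y^2.
A = [[3.0, -3.0],
     [-3.0, 11.0]]

r = math.sqrt(10.0)
# Columns of P are the normalized eigenvectors (3, 1)/sqrt(10) and (-1, 3)/sqrt(10).
P = [[3 / r, -1 / r],
     [1 / r, 3 / r]]

D = mat_mul(mat_mul(transpose(P), A), P)
for row in D:
    print([round(x, 10) for x in row])   # approximately diag(2, 12)
```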


13.16. Prove Theorem 13.12: Let T be an orthogonal operator on a real inner product space V. Then
there exists an orthonormal basis of V in which T is represented by a block diagonal matrix M of
the form

\[ M = \operatorname{diag}\left( 1, \ldots, 1,\ -1, \ldots, -1,\ \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix},\ \ldots,\ \begin{bmatrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{bmatrix} \right) \]

Let $S = T + T^{-1} = T + T^*$. Then $S^* = (T + T^*)^* = T^* + T = S$. Thus, S is a symmetric operator on
V. By Theorem 13.11, there exists an orthonormal basis of V consisting of eigenvectors of S. If $\lambda_1, \ldots, \lambda_m$
denote the distinct eigenvalues of S, then V can be decomposed into the direct sum $V = V_1 \oplus V_2 \oplus \cdots \oplus V_m$
where $V_i$ consists of the eigenvectors of S belonging to $\lambda_i$. We claim that each $V_i$ is invariant under T. For
suppose $v \in V_i$; then $S(v) = \lambda_i v$ and

\[ S(T(v)) = (T + T^{-1})T(v) = T(T + T^{-1})(v) = TS(v) = T(\lambda_i v) = \lambda_i T(v) \]

That is, $T(v) \in V_i$. Hence, $V_i$ is invariant under T. Because the $V_i$ are orthogonal to each other, we can
restrict our investigation to the way that T acts on each individual $V_i$.

On a given $V_i$, we have $(T + T^{-1})v = S(v) = \lambda_i v$. Multiplying by T, we get

\[ (T^2 - \lambda_i T + I)(v) = 0 \tag{1} \]

We consider the cases $\lambda_i = \pm 2$ and $\lambda_i \neq \pm 2$ separately. If $\lambda_i = \pm 2$, then $(T \mp I)^2(v) = 0$, which leads to
$(T \mp I)(v) = 0$ or $T(v) = \pm v$. Thus, T restricted to this $V_i$ is either I or $-I$.

If $\lambda_i \neq \pm 2$, then T has no eigenvectors in $V_i$, because, by Theorem 13.4, the only eigenvalues of T are 1
or $-1$. Accordingly, for $v \neq 0$, the vectors v and $T(v)$ are linearly independent. Let W be the subspace
spanned by v and $T(v)$. Then W is invariant under T, because using (1) we get

\[ T(T(v)) = T^2(v) = \lambda_i T(v) - v \in W \]

By Theorem 7.4, $V_i = W \oplus W^\perp$. Furthermore, by Problem 13.8, $W^\perp$ is also invariant under T. Thus, we can
decompose $V_i$ into the direct sum of two-dimensional subspaces $W_j$ where the $W_j$ are orthogonal to each
other and each $W_j$ is invariant under T. Thus, we can restrict our investigation to the way in which T acts on
each individual $W_j$.

Because $T^2 - \lambda_i T + I = 0$, the characteristic polynomial $\Delta(t)$ of T acting on $W_j$ is $\Delta(t) = t^2 - \lambda_i t + 1$.
Thus, the determinant of T is 1, the constant term in $\Delta(t)$. By Theorem 2.7, the matrix A representing T
acting on $W_j$ relative to any orthonormal basis of $W_j$ must be of the form

\[ \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]

The union of the bases of the $W_j$ gives an orthonormal basis of $V_i$, and the union of the bases of the $V_i$ gives
an orthonormal basis of V in which the matrix representing T is of the desired form.


Normal Operators and Canonical Forms in Unitary Spaces 

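13.17. Determine which of the following matrices is normal:

\[ \text{(a)}\ A = \begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 1 & i \\ 1 & 2+i \end{bmatrix} \]

(a)

\[ AA^* = \begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -i & 1 \end{bmatrix} = \begin{bmatrix} 2 & i \\ -i & 1 \end{bmatrix}, \qquad A^*A = \begin{bmatrix} 1 & 0 \\ -i & 1 \end{bmatrix}\begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & i \\ -i & 2 \end{bmatrix} \]

Because $AA^* \neq A^*A$, the matrix A is not normal.

(b)

\[ BB^* = \begin{bmatrix} 1 & i \\ 1 & 2+i \end{bmatrix}\begin{bmatrix} 1 & 1 \\ -i & 2-i \end{bmatrix} = \begin{bmatrix} 2 & 2+2i \\ 2-2i & 6 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -i & 2-i \end{bmatrix}\begin{bmatrix} 1 & i \\ 1 & 2+i \end{bmatrix} = B^*B \]

Because $BB^* = B^*B$, the matrix B is normal.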


13.18. Let T be a normal operator. Prove the following:

(a) $T(v) = 0$ if and only if $T^*(v) = 0$. (b) $T - \lambda I$ is normal.

(c) If $T(v) = \lambda v$, then $T^*(v) = \bar\lambda v$; hence, any eigenvector of T is also an eigenvector of $T^*$.

(d) If $T(v) = \lambda_1 v$ and $T(w) = \lambda_2 w$ where $\lambda_1 \neq \lambda_2$, then $\langle v, w\rangle = 0$; that is, eigenvectors of T
belonging to distinct eigenvalues are orthogonal.

(a) We show that $\langle T(v), T(v)\rangle = \langle T^*(v), T^*(v)\rangle$:

\[ \langle T(v), T(v)\rangle = \langle v, T^*T(v)\rangle = \langle v, TT^*(v)\rangle = \langle T^*(v), T^*(v)\rangle \]

Hence, by $[I_3]$ in the definition of the inner product in Section 7.2, $T(v) = 0$ if and only if $T^*(v) = 0$.

(b) We show that $T - \lambda I$ commutes with its adjoint:

\[ (T - \lambda I)(T - \lambda I)^* = (T - \lambda I)(T^* - \bar\lambda I) = TT^* - \lambda T^* - \bar\lambda T + \lambda\bar\lambda I = T^*T - \bar\lambda T - \lambda T^* + \bar\lambda\lambda I = (T^* - \bar\lambda I)(T - \lambda I) = (T - \lambda I)^*(T - \lambda I) \]

Thus, $T - \lambda I$ is normal.


392 


CHAPTER 13 Linear Operators on Inner Product Spaces 


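(c) If $T(v) = \lambda v$, then $(T - \lambda I)(v) = 0$. Now $T - \lambda I$ is normal by (b); therefore, by (a), $(T - \lambda I)^*(v) = 0$.
That is, $(T^* - \bar\lambda I)(v) = 0$; hence, $T^*(v) = \bar\lambda v$.

(d) We show that $\lambda_1\langle v, w\rangle = \lambda_2\langle v, w\rangle$:

\[ \lambda_1\langle v, w\rangle = \langle \lambda_1 v, w\rangle = \langle T(v), w\rangle = \langle v, T^*(w)\rangle = \langle v, \bar\lambda_2 w\rangle = \lambda_2\langle v, w\rangle \]

But $\lambda_1 \neq \lambda_2$; hence, $\langle v, w\rangle = 0$.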

13.19. Prove Theorem 13.13: Let T be a normal operator on a complex finite-dimensional inner product
space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Thus, T can
be represented by a diagonal matrix relative to an orthonormal basis.)

The proof is by induction on the dimension of V. If $\dim V = 1$, then the theorem trivially holds. Now
suppose $\dim V = n > 1$. Because V is a complex vector space, T has at least one eigenvalue and hence a
nonzero eigenvector v. Let W be the subspace of V spanned by v, and let $u_1$ be a unit vector in W.

Because v is an eigenvector of T, the subspace W is invariant under T. However, v is also an
eigenvector of $T^*$ by Problem 13.18; hence, W is also invariant under $T^*$. By Problem 13.8, $W^\perp$ is invariant
under $T^{**} = T$. The remainder of the proof is identical with the latter part of the proof of Theorem 13.11
(Problem 13.14).

13.20. Prove Theorem 13.14: Let T be any operator on a complex finite-dimensional inner product space
V. Then T can be represented by a triangular matrix relative to an orthonormal basis of V.

The proof is by induction on the dimension of V. If $\dim V = 1$, then the theorem trivially holds. Now
suppose $\dim V = n > 1$. Because V is a complex vector space, T has at least one eigenvalue and hence at
least one nonzero eigenvector v. Let W be the subspace of V spanned by v, and let $u_1$ be a unit vector in W.
Then $u_1$ is an eigenvector of T and, say, $T(u_1) = a_{11}u_1$.

By Theorem 7.4, $V = W \oplus W^\perp$. Let E denote the orthogonal projection of V onto $W^\perp$. Clearly $W^\perp$ is
invariant under the operator ET. By induction, there exists an orthonormal basis $\{u_2, \ldots, u_n\}$ of $W^\perp$ such
that, for $i = 2, \ldots, n$,

\[ ET(u_i) = a_{i2}u_2 + a_{i3}u_3 + \cdots + a_{ii}u_i \]

(Note that $\{u_1, u_2, \ldots, u_n\}$ is an orthonormal basis of V.) But E is the orthogonal projection of V onto $W^\perp$;
hence, we must have

\[ T(u_i) = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{ii}u_i \]

for $i = 2, \ldots, n$. This with $T(u_1) = a_{11}u_1$ gives us the desired result.

Miscellaneous Problems 

13.21. Prove Theorem 13.10B: The following are equivalent:

(i) $P = T^2$ for some self-adjoint operator T.

(ii) $P = S^*S$ for some operator S; that is, P is positive.

(iii) P is self-adjoint and $\langle P(u), u\rangle \geq 0$ for every $u \in V$.

Suppose (i) holds; that is, $P = T^2$ where $T = T^*$. Then $P = TT = T^*T$, and so (i) implies (ii). Now
suppose (ii) holds. Then $P^* = (S^*S)^* = S^*S^{**} = S^*S = P$, and so P is self-adjoint. Furthermore,

\[ \langle P(u), u\rangle = \langle S^*S(u), u\rangle = \langle S(u), S(u)\rangle \geq 0 \]

Thus, (ii) implies (iii), and so it remains to prove that (iii) implies (i).

Now suppose (iii) holds. Because P is self-adjoint, there exists an orthonormal basis $\{u_1, \ldots, u_n\}$ of V
consisting of eigenvectors of P; say, $P(u_i) = \lambda_i u_i$. By Theorem 13.4, the $\lambda_i$ are real. Using (iii), we show that
the $\lambda_i$ are nonnegative. We have, for each i,

\[ 0 \leq \langle P(u_i), u_i\rangle = \langle \lambda_i u_i, u_i\rangle = \lambda_i\langle u_i, u_i\rangle \]

Thus, $\langle u_i, u_i\rangle > 0$ forces $\lambda_i \geq 0$, as claimed. Accordingly, $\sqrt{\lambda_i}$ is a real number. Let T be the linear operator
defined by

\[ T(u_i) = \sqrt{\lambda_i}\,u_i \quad\text{for } i = 1, \ldots, n \]

Because T is represented by a real diagonal matrix relative to the orthonormal basis $\{u_i\}$, T is self-adjoint.
Moreover, for each i,

\[ T^2(u_i) = T(\sqrt{\lambda_i}\,u_i) = \sqrt{\lambda_i}\,T(u_i) = \sqrt{\lambda_i}\sqrt{\lambda_i}\,u_i = \lambda_i u_i = P(u_i) \]

Because $T^2$ and P agree on a basis of V, $P = T^2$. Thus, the theorem is proved.

Remark: The above operator T is the unique positive operator such that $P = T^2$; it is called the
positive square root of P.

13.22. Show that any operator T is the sum of a self-adjoint operator and a skew-adjoint operator.

Set $S = \tfrac12(T + T^*)$ and $U = \tfrac12(T - T^*)$. Then $T = S + U$, where

\[ S^* = [\tfrac12(T + T^*)]^* = \tfrac12(T^* + T^{**}) = \tfrac12(T^* + T) = S \]
\[ U^* = [\tfrac12(T - T^*)]^* = \tfrac12(T^* - T) = -\tfrac12(T - T^*) = -U \]

that is, S is self-adjoint and U is skew-adjoint.
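The decomposition can be illustrated on matrices, where the adjoint is the conjugate transpose. The matrix T below is an assumed example, with entries chosen so that halving is exact in floating point; the helper names are illustrative.

```python
def conj_transpose(m):
    """Conjugate transpose m* of a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[m[i][j].conjugate() for i in range(rows)] for j in range(cols)]

def combine(a, b, sign):
    """Entrywise (a + sign*b) / 2."""
    return [[(x + sign * y) / 2 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def negate(m):
    """Entrywise negation of m."""
    return [[-x for x in row] for row in m]

def add(a, b):
    """Entrywise sum a + b."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

T = [[1 + 2j, 3],
     [4j, 5]]            # assumed example matrix
T_star = conj_transpose(T)

S = combine(T, T_star, +1)   # (T + T*)/2, the self-adjoint part
U = combine(T, T_star, -1)   # (T - T*)/2, the skew-adjoint part

print(conj_transpose(S) == S)          # S* = S
print(conj_transpose(U) == negate(U))  # U* = -U
print(add(S, U) == T)                  # T = S + U
```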

13.23. Prove: Let T be an arbitrary linear operator on a finite-dimensional inner product space V. Then T 
is a product of a unitary (orthogonal) operator U and a unique positive operator P; that is, 
T = UP. Furthermore, if T is invertible, then U is also uniquely determined. 

By Theorem 13.10, T*T is a positive operator; hence, there exists a (unique) positive operator P such 
that P 2 = T*T (Problem 13.43). Observe that 

Mil 2 = (P{v),P(v)) = (P 2 ( v),v) = (T*T(v),v) = (T(v),T(v)) = ||T( V )|| 2 (1) 

We now consider separately the cases when T is invertible and noninvertible. 

If T is invertible, then we set U = PT~ l . We show that U is unitary: 

U* = (PT~ X )* = T~ U P* = {T*)~ l P and U* U = (T*)~ l PPT~ l = (T*)~ l T*TT~' = I 

Thus, U is unitary. We next set U = U~ 1 . Then U is also unitary, and T = UP as required. 

To prove uniqueness, we assume $T = U_0P_0$, where $U_0$ is unitary and $P_0$ is positive. Then

$$T^*T = P_0^*U_0^*U_0P_0 = P_0IP_0 = P_0^2$$

But the positive square root of $T^*T$ is unique (Problem 13.43); hence, $P_0 = P$. (Note that the invertibility of
$T$ is not used to prove the uniqueness of $P$.) Now if $T$ is invertible, then $P$ is also invertible by (1).
Multiplying $U_0P = UP$ on the right by $P^{-1}$ yields $U_0 = U$. Thus, $U$ is also unique when $T$ is invertible.

Now suppose $T$ is not invertible. Let $W$ be the image of $P$; that is, $W = \operatorname{Im} P$. We define $U_1: W \to V$ by

$$U_1(w) = T(v), \quad\text{where } P(v) = w \qquad (2)$$

We must show that $U_1$ is well defined; that is, that $P(v) = P(v')$ implies $T(v) = T(v')$. This follows from
the fact that $P(v - v') = 0$ is equivalent to $\|P(v - v')\| = 0$, which forces $\|T(v - v')\| = 0$ by (1). Thus, $U_1$
is well defined. We next define $U_2: W^\perp \to V$. Note that, by (1), $P$ and $T$ have the same kernels. Hence, the
images of $P$ and $T$ have the same dimension; that is, $\dim(\operatorname{Im} P) = \dim W = \dim(\operatorname{Im} T)$. Consequently, $W^\perp$
and $(\operatorname{Im} T)^\perp$ also have the same dimension. We let $U_2$ be any isomorphism of inner product spaces (that is,
a linear isomorphism that preserves inner products) between $W^\perp$ and $(\operatorname{Im} T)^\perp$.

We next set $U = U_1 \oplus U_2$. [Here $U$ is defined as follows: If $v \in V$ and $v = w + w'$, where $w \in W$,
$w' \in W^\perp$, then $U(v) = U_1(w) + U_2(w')$.] Now $U$ is linear (Problem 13.69), and, if $v \in V$ and $P(v) = w$,
then, by (2),

$$T(v) = U_1(w) = U(w) = UP(v)$$

Thus, T = UP, as required. 

It remains to show that $U$ is unitary. Now every vector $x \in V$ can be written in the form $x = P(v) + w'$,
where $w' \in W^\perp$. Then $U(x) = UP(v) + U_2(w') = T(v) + U_2(w')$, where $\langle T(v), U_2(w')\rangle = 0$ by definition
of $U_2$. Also, $\langle T(v), T(v)\rangle = \langle P(v), P(v)\rangle$ by (1). Thus,

$$\langle U(x), U(x)\rangle = \langle T(v) + U_2(w'),\ T(v) + U_2(w')\rangle = \langle T(v), T(v)\rangle + \langle U_2(w'), U_2(w')\rangle$$
$$= \langle P(v), P(v)\rangle + \langle w', w'\rangle = \langle P(v) + w',\ P(v) + w'\rangle = \langle x, x\rangle$$

[We also used the fact that $\langle P(v), w'\rangle = 0$.] Thus, $U$ is unitary, and the theorem is proved.


13.24. Let $V$ be the vector space of polynomials over $\mathbf{R}$ with inner product defined by

$$\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$$

Give an example of a linear functional $\phi$ on $V$ for which Theorem 13.3 does not hold; that is, for
which there is no polynomial $h(t)$ such that $\phi(f) = \langle f, h\rangle$ for every $f \in V$.

Let $\phi: V \to \mathbf{R}$ be defined by $\phi(f) = f(0)$; that is, $\phi$ evaluates $f(t)$ at 0, and hence maps $f(t)$ into its
constant term. Suppose a polynomial $h(t)$ exists for which

$$\phi(f) = f(0) = \int_0^1 f(t)h(t)\,dt \qquad (1)$$

for every polynomial $f(t)$. Observe that $\phi$ maps the polynomial $tf(t)$ into 0; hence, by (1),

$$\int_0^1 tf(t)h(t)\,dt = 0 \qquad (2)$$

for every polynomial $f(t)$. In particular, (2) must hold for $f(t) = th(t)$; that is,

$$\int_0^1 t^2h^2(t)\,dt = 0$$

This integral forces $h(t)$ to be the zero polynomial; hence, $\phi(f) = \langle f, h\rangle = \langle f, 0\rangle = 0$ for every
polynomial $f(t)$. This contradicts the fact that $\phi$ is not the zero functional; hence, the polynomial $h(t)$
does not exist.


SUPPLEMENTARY PROBLEMS 


Adjoint Operators 

13.25. Find the adjoint of:

(a) $A = \begin{bmatrix} 5-2i & 3+7i \\ 4-6i & 8+3i \end{bmatrix}$, (b) $B = \begin{bmatrix} 3 & 5i \\ i & -2i \end{bmatrix}$, (c) $C = \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix}$

13.26. Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be defined by $T(x, y, z) = (x + 2y,\ 3x - 4z,\ y)$. Find $T^*(x, y, z)$.

13.27. Let $T: \mathbf{C}^3 \to \mathbf{C}^3$ be defined by $T(x, y, z) = [ix + (2 + 3i)y,\ 3x + (3 - i)z,\ (2 - 5i)y + iz]$.
Find $T^*(x, y, z)$.

13.28. For each linear functional $\phi$ on $V$, find $u \in V$ such that $\phi(v) = \langle v, u\rangle$ for every $v \in V$:

(a) $\phi: \mathbf{R}^3 \to \mathbf{R}$ defined by $\phi(x, y, z) = x + 2y - 3z$.

(b) $\phi: \mathbf{C}^3 \to \mathbf{C}$ defined by $\phi(x, y, z) = ix + (2 + 3i)y + (1 - 2i)z$.

13.29. Suppose $V$ has finite dimension. Prove that the image of $T^*$ is the orthogonal complement of the kernel of $T$;
that is, $\operatorname{Im} T^* = (\operatorname{Ker} T)^\perp$. Hence, $\operatorname{rank}(T) = \operatorname{rank}(T^*)$.

13.30. Show that $T^*T = 0$ implies $T = 0$.

13.31. Let $V$ be the vector space of polynomials over $\mathbf{R}$ with inner product defined by $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Let
$D$ be the derivative operator on $V$; that is, $D(f) = df/dt$. Show that there is no operator $D^*$ on $V$ such that
$\langle D(f), g\rangle = \langle f, D^*(g)\rangle$ for every $f, g \in V$. That is, $D$ has no adjoint.


Unitary and Orthogonal Operators and Matrices 

13.32. Find a unitary (orthogonal) matrix whose first row is 

(a) $(2/\sqrt{13},\ 3/\sqrt{13})$, (b) a multiple of $(1,\ 1 - i)$, (c) a multiple of $(1,\ -i,\ 1 - i)$.

13.33. Prove that the products and inverses of orthogonal matrices are orthogonal. (Thus, the orthogonal matrices 
form a group under multiplication, called the orthogonal group.) 

13.34. Prove that the products and inverses of unitary matrices are unitary. (Thus, the unitary matrices form a group 
under multiplication, called the unitary group.) 

13.35. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal. 

13.36. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix P such that 
B = P*AP. Show that this relation is an equivalence relation. 

13.37. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal matrix P such 
that $B = P^TAP$. Show that this relation is an equivalence relation.

13.38. Let $W$ be a subspace of $V$. For any $v \in V$, let $v = w + w'$, where $w \in W$, $w' \in W^\perp$. (Such a sum is unique
because $V = W \oplus W^\perp$.) Let $T: V \to V$ be defined by $T(v) = w - w'$. Show that $T$ is a self-adjoint unitary
operator on $V$.

13.39. Let $V$ be an inner product space, and suppose $U: V \to V$ (not assumed linear) is surjective (onto) and
preserves inner products; that is, $\langle U(v), U(w)\rangle = \langle v, w\rangle$ for every $v, w \in V$. Prove that $U$ is linear and hence
unitary.

Positive and Positive Definite Operators 

13.40. Show that the sum of two positive (positive definite) operators is positive (positive definite). 

13.41. Let $T$ be a linear operator on $V$ and let $f: V \times V \to K$ be defined by $f(u, v) = \langle T(u), v\rangle$. Show that $f$ is an
inner product on $V$ if and only if $T$ is positive definite.

13.42. Suppose $E$ is an orthogonal projection onto some subspace $W$ of $V$. Prove that $kI + E$ is positive (positive
definite) if $k \ge 0$ ($k > 0$).

13.43. Consider the operator $T$ defined by $T(u_i) = \sqrt{\lambda_i}\,u_i$, $i = 1, \dots, n$, in the proof of Theorem 13.10A. Show that
$T$ is positive and that it is the only positive operator for which $T^2 = P$.


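13.44. Suppose $P$ is both positive and unitary. Prove that $P = I$.

13.45. Determine which of the following matrices are positive (positive definite):

(i) $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, (ii) $\begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}$, (iii) $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$, (iv) $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, (v) $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, (vi) $\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$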


13.46. Prove that a $2 \times 2$ complex matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is positive if and only if (i) $A = A^*$, and (ii) $a$, $d$, and
$|A| = ad - bc$ are nonnegative real numbers.

13.47. Prove that a diagonal matrix $A$ is positive (positive definite) if and only if every diagonal entry is a
nonnegative (positive) real number.


Self-adjoint and Symmetric Matrices 

13.48. For any operator T, show that T + T* is self-adjoint and T — T* is skew-adjoint. 

13.49. Suppose $T$ is self-adjoint. Show that $T^2(v) = 0$ implies $T(v) = 0$. Use this to prove that $T^n(v) = 0$ also
implies that $T(v) = 0$ for $n > 0$.

13.50. Let $V$ be a complex inner product space. Suppose $\langle T(v), v\rangle$ is real for every $v \in V$. Show that $T$ is
self-adjoint.

13.51. Suppose $T_1$ and $T_2$ are self-adjoint. Show that $T_1T_2$ is self-adjoint if and only if $T_1$ and $T_2$ commute; that is,
$T_1T_2 = T_2T_1$.

13.52. For each of the following symmetric matrices $A$, find an orthogonal matrix $P$ and a diagonal matrix $D$ such
that $D = P^TAP$:

(a) $A = \begin{bmatrix} 1 & 2 \\ 2 & -2 \end{bmatrix}$, (b) $A = \begin{bmatrix} 5 & 4 \\ 4 & -1 \end{bmatrix}$, (c) $A = \begin{bmatrix} 7 & 3 \\ 3 & -1 \end{bmatrix}$

13.53. Find an orthogonal change of coordinates $X = PX'$ that diagonalizes each of the following quadratic forms
and find the corresponding diagonal quadratic form $q(x')$:

(a) $q(x, y) = 2x^2 - 6xy + 10y^2$, (b) $q(x, y) = x^2 + 8xy - 5y^2$,
(c) $q(x, y, z) = 2x^2 - 4xy + 5y^2 + 2xz - 4yz + 2z^2$


Normal Operators and Matrices 

13.54. Let $A = \begin{bmatrix} 2 & i \\ i & 2 \end{bmatrix}$. Verify that $A$ is normal. Find a unitary matrix $P$ such that $P^*AP$ is diagonal. Find $P^*AP$.

13.55. Show that a triangular matrix is normal if and only if it is diagonal.

13.56. Prove that if $T$ is normal on $V$, then $\|T(v)\| = \|T^*(v)\|$ for every $v \in V$. Prove that the converse holds in
complex inner product spaces.


13.57. Show that self-adjoint, skew-adjoint, and unitary (orthogonal) operators are normal. 


13.58. Suppose T is normal. Prove that 

(a) T is self-adjoint if and only if its eigenvalues are real. 

(b) T is unitary if and only if its eigenvalues have absolute value 1. 

(c) T is positive if and only if its eigenvalues are nonnegative real numbers. 


13.59. Show that if T is normal, then T and T* have the same kernel and the same image. 

13.60. Suppose $T_1$ and $T_2$ are normal and commute. Show that $T_1 + T_2$ and $T_1T_2$ are also normal.

13.61. Suppose $T_1$ is normal and commutes with $T_2$. Show that $T_1$ also commutes with $T_2^*$.

13.62. Prove the following: Let $T_1$ and $T_2$ be commuting normal operators on a complex finite-dimensional vector space $V$.
Then there exists an orthonormal basis of $V$ consisting of eigenvectors of both $T_1$ and $T_2$. (That is, $T_1$ and $T_2$
can be simultaneously diagonalized.)


Isomorphism Problems for Inner Product Spaces 

13.63. Let $S = \{u_1, \dots, u_n\}$ be an orthonormal basis of an inner product space $V$ over $K$. Show that the mapping
$v \mapsto [v]_S$ is an (inner product space) isomorphism between $V$ and $K^n$. (Here $[v]_S$ denotes the coordinate
vector of $v$ in the basis $S$.)

13.64. Show that inner product spaces $V$ and $W$ over $K$ are isomorphic if and only if $V$ and $W$ have the same
dimension.

13.65. Suppose $\{u_1, \dots, u_n\}$ and $\{u'_1, \dots, u'_n\}$ are orthonormal bases of $V$ and $W$, respectively. Let $T: V \to W$ be
the linear map defined by $T(u_i) = u'_i$ for each $i$. Show that $T$ is an isomorphism.

13.66. Let $V$ be an inner product space. Recall that each $u \in V$ determines a linear functional $\hat u$ in the dual space $V^*$
by the definition $\hat u(v) = \langle v, u\rangle$ for every $v \in V$. (See the text immediately preceding Theorem 13.3.) Show
that the map $u \mapsto \hat u$ is linear and nonsingular, and hence an isomorphism from $V$ onto $V^*$.

Miscellaneous Problems 

13.67. Suppose $\{u_1, \dots, u_n\}$ is an orthonormal basis of $V$. Prove

(a) $\langle a_1u_1 + a_2u_2 + \cdots + a_nu_n,\ b_1u_1 + b_2u_2 + \cdots + b_nu_n\rangle = a_1\bar b_1 + a_2\bar b_2 + \cdots + a_n\bar b_n$

(b) Let $A = [a_{ij}]$ be the matrix representing $T: V \to V$ in the basis $\{u_i\}$. Then $a_{ij} = \langle T(u_j), u_i\rangle$.

13.68. Show that there exists an orthonormal basis $\{u_1, \dots, u_n\}$ of $V$ consisting of eigenvectors of $T$ if and only if
there exist orthogonal projections $E_1, \dots, E_r$ and scalars $\lambda_1, \dots, \lambda_r$ such that

(i) $T = \lambda_1E_1 + \cdots + \lambda_rE_r$, (ii) $E_1 + \cdots + E_r = I$, (iii) $E_iE_j = 0$ for $i \ne j$

13.69. Suppose $V = U \oplus W$ and suppose $T_1: U \to V$ and $T_2: W \to V$ are linear. Show that $T = T_1 \oplus T_2$ is also
linear. Here $T$ is defined as follows: If $v \in V$ and $v = u + w$ where $u \in U$, $w \in W$, then

$$T(v) = T_1(u) + T_2(w)$$


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $[R_1;\ R_2;\ \dots;\ R_n]$ denotes a matrix with rows $R_1, R_2, \dots, R_n$.

13.25. (a) $[5 + 2i,\ 4 + 6i;\ 3 - 7i,\ 8 - 3i]$, (b) $[3,\ -i;\ -5i,\ 2i]$, (c) $[1, 2;\ 1, 3]$

13.26. $T^*(x, y, z) = (x + 3y,\ 2x + z,\ -4y)$

13.27. $T^*(x, y, z) = [-ix + 3y,\ (2 - 3i)x + (2 + 5i)z,\ (3 + i)y - iz]$

13.28. (a) $u = (1, 2, -3)$, (b) $u = (-i,\ 2 - 3i,\ 1 + 2i)$

13.32. (a) $(1/\sqrt{13})[2, 3;\ 3, -2]$, (b) $(1/\sqrt{3})[1,\ 1 - i;\ 1 + i,\ -1]$,
(c) $\tfrac12[1,\ -i,\ 1 - i;\ \sqrt2\,i,\ -\sqrt2,\ 0;\ 1,\ -i,\ -1 + i]$

13.45. Only (i) and (v) are positive. Only (v) is positive definite.

13.52. (a and b) $P = (1/\sqrt5)[2, -1;\ 1, 2]$, (c) $P = (1/\sqrt{10})[3, -1;\ 1, 3]$;
(a) $D = [2, 0;\ 0, -3]$, (b) $D = [7, 0;\ 0, -3]$, (c) $D = [8, 0;\ 0, -2]$

13.53. (a) $x = (3x' - y')/\sqrt{10}$, $y = (x' + 3y')/\sqrt{10}$; (b) $x = (2x' - y')/\sqrt5$, $y = (x' + 2y')/\sqrt5$;
(c) $x = x'/\sqrt3 + y'/\sqrt2 + z'/\sqrt6$, $y = x'/\sqrt3 - 2z'/\sqrt6$, $z = x'/\sqrt3 - y'/\sqrt2 + z'/\sqrt6$;
(a) $q(x') = \operatorname{diag}(1, 11)$; (b) $q(x') = \operatorname{diag}(3, -7)$; (c) $q(x') = \operatorname{diag}(1, 1, 7)$

13.54. $P = (1/\sqrt2)[1, -1;\ 1, 1]$, $P^*AP = \operatorname{diag}(2 + i,\ 2 - i)$





APPENDIX A 



Multilinear Products 


A.1 Introduction


The material in this appendix is much more abstract than that which has previously appeared. Accordingly, 
many of the proofs will be omitted. Also, we motivate the material with the following observation. 

Let S be a basis of a vector space V. Theorem 5.2 may be restated as follows. 


THEOREM 5.2: Let $g: S \to V$ be the inclusion map of the basis $S$ into $V$. Then, for any vector space $U$
and any mapping $f: S \to U$, there exists a unique linear mapping $f^*: V \to U$ such that
$f = f^* \circ g$.

Another way to state the fact that $f = f^* \circ g$ is that the diagram in Fig. A-1(a) commutes.


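[Figure A-1: three commutative diagrams. (a) $g: S \to V$, $f: S \to U$, $f^*: V \to U$. (b) $g: V \times W \to T = V \otimes W$, $f: V \times W \to U$, $f^*: T \to U$. (c) $g: V^r \to E = \wedge^rV$, $f: V^r \to U$, $f^*: E \to U$.]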

A.2 Bilinear Mapping and Tensor Products 


Let $U$, $V$, $W$ be vector spaces over a field $K$. Consider a map

$$f: V \times W \to U$$

Then $f$ is said to be bilinear if, for each $v \in V$, the map $f_v: W \to U$ defined by $f_v(w) = f(v, w)$ is linear;
and, for each $w \in W$, the map $f_w: V \to U$ defined by $f_w(v) = f(v, w)$ is linear.

That is, $f$ is linear in each of its two variables. Note that $f$ is similar to a bilinear form except that the
values of the map $f$ are in a vector space $U$ rather than the field $K$.

DEFINITION A.1: Let $V$ and $W$ be vector spaces over the same field $K$. The tensor product of $V$ and $W$
is a vector space $T$ over $K$ together with a bilinear map $g: V \times W \to T$, denoted by
$g(v, w) = v \otimes w$, with the following property: (*) For any vector space $U$ over $K$
and any bilinear map $f: V \times W \to U$ there exists a unique linear map $f^*: T \to U$
such that $f^* \circ g = f$.

The tensor product $(T, g)$ [or simply $T$ when $g$ is understood] of $V$ and $W$ is denoted by $V \otimes W$, and the
element $v \otimes w$ is called the tensor of $v$ and $w$.

Another way to state condition (*) is that the diagram in Fig. A-1(b) commutes. The fact that such
a unique linear map $f^*$ exists is called the “Universal Mapping Principle” (UMP). As illustrated in
Fig. A-1(b), condition (*) also says that any bilinear map $f: V \times W \to U$ “factors through” the tensor product
$T = V \otimes W$. The uniqueness in (*) implies that the image of $g$ spans $T$; that is, $\operatorname{span}(\{v \otimes w\}) = T$.

THEOREM A.1: (Uniqueness of Tensor Products) Let $(T, g)$ and $(T', g')$ be tensor products of $V$
and $W$. Then there exists a unique isomorphism $h: T \to T'$ such that $hg = g'$.

Proof. Because $T$ is a tensor product, and $g': V \times W \to T'$ is bilinear, there exists a unique linear map
$h: T \to T'$ such that $hg = g'$. Similarly, because $T'$ is a tensor product, and $g: V \times W \to T$ is bilinear,
there exists a unique linear map $h': T' \to T$ such that $h'g' = g$. Using $hg = g'$, we get $h'hg = g$. Also,
because $T$ is a tensor product, and $g: V \times W \to T$ is bilinear, there exists a unique linear map $h^*: T \to T$
such that $h^*g = g$. But $1_Tg = g$. Thus, $h'h = h^* = 1_T$. Similarly, $hh' = 1_{T'}$. Therefore, $h$ is an
isomorphism from $T$ to $T'$.


THEOREM A.2: (Existence of Tensor Product) The tensor product $T = V \otimes W$ of vector spaces $V$
and $W$ over $K$ exists. Let $\{v_1, \dots, v_m\}$ be a basis of $V$ and let $\{w_1, \dots, w_n\}$ be a
basis of $W$. Then the $mn$ vectors

$$v_i \otimes w_j \quad (i = 1, \dots, m;\ j = 1, \dots, n)$$

form a basis of $T$. Thus, $\dim T = mn = (\dim V)(\dim W)$.

Outline of Proof. Suppose $\{v_1, \dots, v_m\}$ is a basis of $V$, and suppose $\{w_1, \dots, w_n\}$ is a basis of $W$.
Consider the $mn$ symbols $\{t_{ij} \mid i = 1, \dots, m;\ j = 1, \dots, n\}$. Let $T$ be the vector space generated by the $t_{ij}$.
That is, $T$ consists of all linear combinations of the $t_{ij}$ with coefficients in $K$. [See Problem 4.137.]

Let $v \in V$ and $w \in W$. Say

$$v = a_1v_1 + a_2v_2 + \cdots + a_mv_m \quad\text{and}\quad w = b_1w_1 + b_2w_2 + \cdots + b_nw_n$$

Let $g: V \times W \to T$ be defined by

$$g(v, w) = \sum_i \sum_j a_ib_jt_{ij}$$

Then $g$ is bilinear. [Proof left to reader.]

Now let $f: V \times W \to U$ be bilinear. Because the $t_{ij}$ form a basis of $T$, Theorem 5.2 (stated above) tells
us that there exists a unique linear map $f^*: T \to U$ such that $f^*(t_{ij}) = f(v_i, w_j)$. Then, for $v = \sum_i a_iv_i$ and
$w = \sum_j b_jw_j$, we have

$$f(v, w) = \sum_i \sum_j a_ib_jf(v_i, w_j) = \sum_i \sum_j a_ib_jf^*(t_{ij}) = f^*(g(v, w))$$

Therefore, $f = f^*g$ where $f^*$ is the required map in Definition A.1. Thus, $T$ is a tensor product.

Let $\{v'_1, \dots, v'_m\}$ be any basis of $V$ and $\{w'_1, \dots, w'_n\}$ be any basis of $W$.

Let $v \in V$ and $w \in W$ and say

$$v = a'_1v'_1 + \cdots + a'_mv'_m \quad\text{and}\quad w = b'_1w'_1 + \cdots + b'_nw'_n$$

Then

$$v \otimes w = g(v, w) = \sum_i \sum_j a'_ib'_jg(v'_i, w'_j) = \sum_i \sum_j a'_ib'_j(v'_i \otimes w'_j)$$

Thus, the elements $v'_i \otimes w'_j$ span $T$. There are $mn$ such elements. They cannot be linearly dependent
because $\{t_{ij}\}$ is a basis of $T$, and hence, $\dim T = mn$. Thus, the $v'_i \otimes w'_j$ form a basis of $T$.

Next we give two concrete examples of tensor products. 


EXAMPLE A.1 Let $V$ be the vector space of polynomials $P_{r-1}(x)$ and let $W$ be the vector space of polynomials
$P_{s-1}(y)$. Thus, the following form bases of $V$ and $W$, respectively,

$$1, x, x^2, \dots, x^{r-1} \quad\text{and}\quad 1, y, y^2, \dots, y^{s-1}$$

In particular, $\dim V = r$ and $\dim W = s$. Let $T$ be the vector space of polynomials in variables $x$ and $y$ with
basis

$$\{x^iy^j\} \quad\text{where } i = 0, 1, \dots, r - 1;\ j = 0, 1, \dots, s - 1$$

Then $T$ is the tensor product $V \otimes W$ under the mapping

$$x^i \otimes y^j = x^iy^j$$

For example, suppose $v = 2 - 5x + 3x^3$ and $w = 7y + 4y^2$. Then

$$v \otimes w = 14y + 8y^2 - 35xy - 20xy^2 + 21x^3y + 12x^3y^2$$

Note, $\dim T = rs = (\dim V)(\dim W)$.

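EXAMPLE A.2

Let $V$ be the vector space of $m \times n$ matrices over a field $K$ and let $W$ be the vector space of $p \times q$ matrices
over $K$. Suppose $A = [a_{ij}]$ belongs to $V$, and $B$ belongs to $W$. Let $T$ be the vector space of $mp \times nq$
matrices over $K$. Then $T$ is the tensor product of $V$ and $W$ where $A \otimes B$ is the block matrix

$$A \otimes B = [a_{ij}B] = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix}$$

For example, suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$. Then

$$A \otimes B = \begin{bmatrix} 1 & 2 & 3 & 2 & 4 & 6 \\ 4 & 5 & 6 & 8 & 10 & 12 \\ 3 & 6 & 9 & 4 & 8 & 12 \\ 12 & 15 & 18 & 16 & 20 & 24 \end{bmatrix}$$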
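
The block-matrix construction of Example A.2 can be sketched in a few lines of code. The following is a minimal illustration, assuming matrices are stored as lists of rows; the function name `kronecker` is an illustrative choice, not notation from the text.

```python
def kronecker(a, b):
    """Return the block matrix [a_ij * B] for matrices given as lists of rows."""
    p, q = len(b), len(b[0])
    result = []
    for row_a in a:
        # Each row of A contributes p rows, one per row of B.
        for i in range(p):
            result.append([a_ij * b[i][j] for a_ij in row_a for j in range(q)])
    return result


A = [[1, 2], [3, 4]]
B = [[1, 2, 3], [4, 5, 6]]
for row in kronecker(A, B):
    print(row)
```

With the matrices $A$ and $B$ of the example, the four printed rows reproduce the $4 \times 6$ matrix $A \otimes B$ displayed there.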


Isomorphisms of Tensor Products 

First we note that tensoring is associative in a canonical way. Namely,

THEOREM A.3: Let $U$, $V$, $W$ be vector spaces over a field $K$. Then there exists a unique isomorphism

$$(U \otimes V) \otimes W \to U \otimes (V \otimes W)$$

such that, for every $u \in U$, $v \in V$, $w \in W$,

$$(u \otimes v) \otimes w \mapsto u \otimes (v \otimes w)$$

Accordingly, we may omit parentheses when tensoring any number of factors. Specifically, given
vector spaces $V_1, V_2, \dots, V_m$ over a field $K$, we may unambiguously form their tensor product

$$V_1 \otimes V_2 \otimes \cdots \otimes V_m$$

and, for vectors $v_j$ in $V_j$, we may unambiguously form the tensor product

$$v_1 \otimes v_2 \otimes \cdots \otimes v_m$$

Moreover, given a vector space $V$ over $K$, we may unambiguously define the following tensor
product:

$$\otimes^rV = V \otimes V \otimes \cdots \otimes V \quad (r \text{ factors})$$

Also, there is a canonical isomorphism

$$(\otimes^rV) \otimes (\otimes^sV) \to \otimes^{r+s}V$$

Furthermore, viewing $K$ as a vector space over itself, we have the canonical isomorphism

$$K \otimes V \to V$$

where we define $a \otimes v = av$.


A.3 Alternating Multilinear Maps


Let $f: V^r \to U$ where $V$ and $U$ are vector spaces over $K$. [Recall $V^r = V \times V \times \cdots \times V$, $r$ factors.]

(1) The mapping $f$ is said to be multilinear or $r$-linear if $f(v_1, \dots, v_r)$ is linear as a function of each $v_j$
when the other $v_i$'s are held fixed. That is,

$$f(\dots, v_j + v'_j, \dots) = f(\dots, v_j, \dots) + f(\dots, v'_j, \dots)$$
$$f(\dots, kv_j, \dots) = kf(\dots, v_j, \dots)$$

where only the $j$th position changes.

(2) The mapping $f$ is said to be alternating if

$$f(v_1, \dots, v_r) = 0 \quad\text{whenever } v_i = v_j \text{ with } i \ne j$$

One can easily show (Prove!) that if $f$ is an alternating multilinear mapping on $V^r$, then

$$f(\dots, v_i, \dots, v_j, \dots) = -f(\dots, v_j, \dots, v_i, \dots)$$

That is, if two of the vectors are interchanged, then the associated value changes sign.

EXAMPLE A.3 (Determinants) 

The determinant function $D: M \to K$ on the space $M$ of $n \times n$ matrices may be viewed as an $n$-variable function

$$D(A) = D(R_1, R_2, \dots, R_n)$$

defined on the rows $R_1, R_2, \dots, R_n$ of $A$. Recall (Chapter 8) that, in this context, $D$ is both $n$-linear and alternating.

We now need some additional notation. Let $K = [k_1, k_2, \dots, k_r]$ denote an $r$-list ($r$-tuple) of elements
from $I_n = \{1, 2, \dots, n\}$. We will then use the following notation where the $v_k$'s denote vectors and the
$a_{ik}$'s denote scalars:

$$v_K = (v_{k_1}, v_{k_2}, \dots, v_{k_r}) \quad\text{and}\quad a_K = a_{1k_1}a_{2k_2}\cdots a_{rk_r}$$

Note $v_K$ is a list of $r$ vectors, and $a_K$ is a product of $r$ scalars.

Now suppose the elements in $K = [k_1, k_2, \dots, k_r]$ are distinct. Then $K$ is a permutation $\sigma_K$ of an $r$-list
$J = [i_1, i_2, \dots, i_r]$ in standard form, that is, where $i_1 < i_2 < \cdots < i_r$. The number of such standard-form
$r$-lists $J$ from $I_n$ is the binomial coefficient:

$$\binom{n}{r} = \frac{n!}{r!\,(n - r)!}$$

[Recall $\operatorname{sign}(\sigma_K) = (-1)^{m_K}$ where $m_K$ is the number of interchanges that transforms $K$ into $J$.]

Now suppose $A = [a_{ij}]$ is an $r \times n$ matrix. For a given ordered $r$-list $J$, we define

$$D_J(A) = \begin{vmatrix} a_{1i_1} & a_{1i_2} & \cdots & a_{1i_r} \\ a_{2i_1} & a_{2i_2} & \cdots & a_{2i_r} \\ \vdots & \vdots & & \vdots \\ a_{ri_1} & a_{ri_2} & \cdots & a_{ri_r} \end{vmatrix}$$

That is, $D_J(A)$ is the determinant of the $r \times r$ submatrix of $A$ whose column subscripts belong to $J$.

Our main theorem below uses the following “shuffling” lemma. 

LEMMA A.4 Let $V$ and $U$ be vector spaces over $K$, and let $f: V^r \to U$ be an alternating $r$-linear
mapping. Let $v_1, v_2, \dots, v_n$ be vectors in $V$ and let $A = [a_{ij}]$ be an $r \times n$ matrix over $K$
where $r \le n$. For $i = 1, 2, \dots, r$, let

$$u_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n$$

Then

$$f(u_1, \dots, u_r) = \sum_J D_J(A)\,f(v_{i_1}, v_{i_2}, \dots, v_{i_r})$$

where the sum is over all standard-form $r$-lists $J = [i_1, i_2, \dots, i_r]$.

The proof is technical but straightforward. The linearity of $f$ gives us the sum

$$f(u_1, \dots, u_r) = \sum_K a_Kf(v_K)$$

where the sum is over all $r$-lists $K$ from $\{1, \dots, n\}$. The alternating property of $f$ tells us that $f(v_K) = 0$
when $K$ does not contain distinct integers. The proof now mainly uses the fact that as we interchange the
$v_j$'s to transform

$$f(v_K) = f(v_{k_1}, v_{k_2}, \dots, v_{k_r}) \quad\text{to}\quad f(v_J) = f(v_{i_1}, v_{i_2}, \dots, v_{i_r})$$

so that $i_1 < \cdots < i_r$, the associated sign of $a_K$ will change in the same way as the sign of the
corresponding permutation $\sigma_K$ changes when it is transformed to the identity permutation using
transpositions.

We illustrate the lemma below for r = 2 and n — 3. 

EXAMPLE A.4 Suppose $f: V^2 \to U$ is an alternating multilinear function. Let $v_1, v_2, v_3 \in V$ and let $u, w \in V$.
Suppose

$$u = a_1v_1 + a_2v_2 + a_3v_3 \quad\text{and}\quad w = b_1v_1 + b_2v_2 + b_3v_3$$

Consider

$$f(u, w) = f(a_1v_1 + a_2v_2 + a_3v_3,\ b_1v_1 + b_2v_2 + b_3v_3)$$

Using multilinearity, we get nine terms:

$$f(u, w) = a_1b_1f(v_1, v_1) + a_1b_2f(v_1, v_2) + a_1b_3f(v_1, v_3)$$
$$+\ a_2b_1f(v_2, v_1) + a_2b_2f(v_2, v_2) + a_2b_3f(v_2, v_3)$$
$$+\ a_3b_1f(v_3, v_1) + a_3b_2f(v_3, v_2) + a_3b_3f(v_3, v_3)$$

(Note that $J = [1, 2]$, $J' = [1, 3]$ and $J'' = [2, 3]$ are the three standard-form 2-lists of $I = [1, 2, 3]$.) The
alternating property of $f$ tells us that each $f(v_i, v_i) = 0$; hence, three of the above nine terms are equal to 0.
The alternating property also tells us that $f(v_i, v_j) = -f(v_j, v_i)$. Thus, three of the terms can be
transformed so their subscripts form a standard-form 2-list by a single interchange. Finally we obtain

$$f(u, w) = (a_1b_2 - a_2b_1)f(v_1, v_2) + (a_1b_3 - a_3b_1)f(v_1, v_3) + (a_2b_3 - a_3b_2)f(v_2, v_3)$$
$$= \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} f(v_1, v_2) + \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} f(v_1, v_3) + \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} f(v_2, v_3)$$

which is the content of Lemma A.4.

A.4 Exterior Products 


The following definition applies. 

DEFINITION A.2: Let $V$ be an $n$-dimensional vector space over a field $K$, and let $r$ be an integer such
that $1 \le r \le n$. The $r$-fold exterior product (or simply exterior product when $r$ is
understood) is a vector space $E$ over $K$ together with an alternating $r$-linear mapping
$g: V^r \to E$, denoted by $g(v_1, \dots, v_r) = v_1 \wedge \cdots \wedge v_r$, with the following property:
(*) For any vector space $U$ over $K$ and any alternating $r$-linear map $f: V^r \to U$ there
exists a unique linear map $f^*: E \to U$ such that $f^* \circ g = f$.

The $r$-fold exterior product $(E, g)$ (or simply $E$ when $g$ is understood) of $V$ is denoted by $\wedge^rV$, and the
element $v_1 \wedge \cdots \wedge v_r$ is called the exterior product or wedge product of the $v_i$'s.

Another way to state condition (*) is that the diagram in Fig. A-1(c) commutes. Again, the fact that such
a unique linear map $f^*$ exists is called the “Universal Mapping Principle” (UMP). As illustrated in
Fig. A-1(c), condition (*) also says that any alternating $r$-linear map $f: V^r \to U$ “factors through” the exterior
product $E = \wedge^rV$. Again, the uniqueness in (*) implies that the image of $g$ spans $E$; that is,
$\operatorname{span}(\{v_1 \wedge \cdots \wedge v_r\}) = E$.


THEOREM A.5: (Uniqueness of Exterior Products) Let $(E, g)$ and $(E', g')$ be $r$-fold exterior products
of $V$. Then there exists a unique isomorphism $h: E \to E'$ such that $hg = g'$.

The proof is the same as the proof of Theorem A.1, which uses the UMP.

THEOREM A.6: (Existence of Exterior Products) Let $V$ be an $n$-dimensional vector space over $K$.
Then the exterior product $E = \wedge^rV$ exists. If $r > n$, then $E = \{0\}$. If $r \le n$, then
$\dim E = \binom{n}{r}$. Moreover, if $[v_1, \dots, v_n]$ is a basis of $V$, then the vectors

$$v_{i_1} \wedge v_{i_2} \wedge \cdots \wedge v_{i_r},$$

where $1 \le i_1 < i_2 < \cdots < i_r \le n$, form a basis of $E$.

We give a concrete example of an exterior product. 

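EXAMPLE A.5 (Cross Product)

Consider $V = \mathbf{R}^3$ with the usual basis $(\mathbf{i}, \mathbf{j}, \mathbf{k})$. Let $E = \wedge^2V$. Note $\dim V = 3$. Thus, $\dim E = 3$ with basis
$\mathbf{i} \wedge \mathbf{j}$, $\mathbf{i} \wedge \mathbf{k}$, $\mathbf{j} \wedge \mathbf{k}$. We identify $E$ with $\mathbf{R}^3$ under the correspondence

$$\mathbf{i} = \mathbf{j} \wedge \mathbf{k}, \quad \mathbf{j} = \mathbf{k} \wedge \mathbf{i} = -\mathbf{i} \wedge \mathbf{k}, \quad \mathbf{k} = \mathbf{i} \wedge \mathbf{j}$$

Let $u$ and $w$ be arbitrary vectors in $V = \mathbf{R}^3$, say

$$u = (a_1, a_2, a_3) = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k} \quad\text{and}\quad w = (b_1, b_2, b_3) = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$$

Then, as in Example A.4,

$$u \wedge w = (a_1b_2 - a_2b_1)(\mathbf{i} \wedge \mathbf{j}) + (a_1b_3 - a_3b_1)(\mathbf{i} \wedge \mathbf{k}) + (a_2b_3 - a_3b_2)(\mathbf{j} \wedge \mathbf{k})$$

Using the above identification, we get

$$u \wedge w = (a_2b_3 - a_3b_2)\mathbf{i} - (a_1b_3 - a_3b_1)\mathbf{j} + (a_1b_2 - a_2b_1)\mathbf{k}$$
$$= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix}\mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix}\mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}\mathbf{k}$$

The reader may recognize that the above exterior product is precisely the well-known cross product
in $\mathbf{R}^3$.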

Our last theorem tells us that we are actually able to “multiply” exterior products, which allows us to 
form an “exterior algebra” that is illustrated below. 

THEOREM A.7: Let $V$ be a vector space over $K$. Let $r$ and $s$ be positive integers. Then there is a
unique bilinear mapping

$$\wedge^rV \times \wedge^sV \to \wedge^{r+s}V$$

such that, for any vectors $u_i$, $w_j$ in $V$,

$$(u_1 \wedge \cdots \wedge u_r) \times (w_1 \wedge \cdots \wedge w_s) \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s$$

EXAMPLE A.6

We form an exterior algebra $A$ over a field $K$ using noncommuting variables $x$, $y$, $z$. Because it is an exterior algebra,
our variables satisfy:

$$x \wedge x = 0, \quad y \wedge y = 0, \quad z \wedge z = 0, \quad\text{and}\quad y \wedge x = -x \wedge y, \quad z \wedge x = -x \wedge z, \quad z \wedge y = -y \wedge z$$

Every element of $A$ is a linear combination of the eight elements

$$1, \quad x, \quad y, \quad z, \quad x \wedge y, \quad x \wedge z, \quad y \wedge z, \quad x \wedge y \wedge z$$

We multiply two “polynomials” in $A$ using the usual distributive law, but now we also use the above conditions. For
example,

$$[3 + 4y - 5x \wedge y + 6x \wedge z] \wedge [5x - 2y] = 15x - 6y - 20x \wedge y + 12x \wedge y \wedge z$$

Observe we use the fact that

$$[4y] \wedge [5x] = 20y \wedge x = -20x \wedge y \quad\text{and}\quad [6x \wedge z] \wedge [-2y] = -12x \wedge z \wedge y = 12x \wedge y \wedge z$$
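
The multiplication rules of Example A.6 can be mechanized. The following is a minimal sketch, assuming an element of the exterior algebra is stored as a dictionary that maps a sorted tuple of variable names (a basis monomial, with the empty tuple standing for 1) to its coefficient; this representation and the names `wedge_basis` and `wedge` are illustrative assumptions, not the text's construction.

```python
def wedge_basis(m1, m2):
    """Wedge two basis monomials given as sorted tuples of variable names.

    Returns (sign, monomial); the sign is 0 when a variable repeats.
    """
    seq = list(m1 + m2)
    if len(set(seq)) < len(seq):
        return 0, ()
    sign = 1
    # Bubble sort into standard order; each adjacent interchange flips the sign.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)


def wedge(p, q):
    """Wedge product of two elements, extended by the distributive law."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            sign, m = wedge_basis(m1, m2)
            if sign:
                out[m] = out.get(m, 0) + sign * c1 * c2
    return {m: c for m, c in out.items() if c != 0}


# 3 + 4y - 5 x^y + 6 x^z   and   5x - 2y
p = {(): 3, ("y",): 4, ("x", "y"): -5, ("x", "z"): 6}
q = {("x",): 5, ("y",): -2}
product = wedge(p, q)
print(product)
```

The resulting dictionary has coefficient 15 on $x$, $-6$ on $y$, $-20$ on $x \wedge y$, and 12 on $x \wedge y \wedge z$, in agreement with the product computed in the example.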



APPENDIX B



Algebraic Structures


B.1 Introduction


We define here algebraic structures that occur in almost all branches of mathematics. In particular, we will 
define a field that appears in the definition of a vector space. We begin with the definition of a group, 
which is a relatively simple algebraic structure with only one operation and is used as a building block for 
many other algebraic systems. 


B.2 Groups 


Let $G$ be a nonempty set with a binary operation; that is, to each pair of elements $a, b \in G$ there is assigned
an element $ab \in G$. Then $G$ is called a group if the following axioms hold:

[$G_1$] For any $a, b, c \in G$, we have $(ab)c = a(bc)$ (the associative law).

[$G_2$] There exists an element $e \in G$, called the identity element, such that $ae = ea = a$ for every
$a \in G$.

[$G_3$] For each $a \in G$ there exists an element $a^{-1} \in G$, called the inverse of $a$, such that
$aa^{-1} = a^{-1}a = e$.

A group $G$ is said to be abelian (or commutative) if the commutative law holds; that is, if $ab = ba$ for
every $a, b \in G$.

When the binary operation is denoted by juxtaposition as above, the group $G$ is said to be written
multiplicatively. Sometimes, when $G$ is abelian, the binary operation is denoted by + and $G$ is said to be
written additively. In such a case, the identity element is denoted by 0 and is called the zero element; the
inverse is denoted by $-a$ and it is called the negative of $a$.

If $A$ and $B$ are subsets of a group $G$, then we write

$$AB = \{ab \mid a \in A,\ b \in B\} \quad\text{or}\quad A + B = \{a + b \mid a \in A,\ b \in B\}$$

We also write $a$ for $\{a\}$.

A subset $H$ of a group $G$ is called a subgroup of $G$ if $H$ forms a group under the operation of $G$. If $H$ is
a subgroup of $G$ and $a \in G$, then the set $Ha$ is called a right coset of $H$ and the set $aH$ is called a left coset
of $H$.

DEFINITION: A subgroup $H$ of $G$ is called a normal subgroup if $a^{-1}Ha \subseteq H$ for every $a \in G$.
Equivalently, $H$ is normal if $aH = Ha$ for every $a \in G$; that is, if the right and left
cosets of $H$ coincide.

Note that every subgroup of an abelian group is normal.

THEOREM B.1: Let $H$ be a normal subgroup of $G$. Then the cosets of $H$ in $G$ form a group under coset
multiplication. This group is called the quotient group and is denoted by $G/H$.

EXAMPLE B.1 The set $\mathbf{Z}$ of integers forms an abelian group under addition. (We remark that the even integers
form a subgroup of $\mathbf{Z}$ but the odd integers do not.) Let $H$ denote the set of multiples of 5; that is,
$H = \{\dots, -10, -5, 0, 5, 10, \dots\}$. Then $H$ is a subgroup (necessarily normal) of $\mathbf{Z}$. The cosets of $H$ in $\mathbf{Z}$ follow:

$$\bar 0 = 0 + H = H = \{\dots, -10, -5, 0, 5, 10, \dots\}$$
$$\bar 1 = 1 + H = \{\dots, -9, -4, 1, 6, 11, \dots\}$$
$$\bar 2 = 2 + H = \{\dots, -8, -3, 2, 7, 12, \dots\}$$
$$\bar 3 = 3 + H = \{\dots, -7, -2, 3, 8, 13, \dots\}$$
$$\bar 4 = 4 + H = \{\dots, -6, -1, 4, 9, 14, \dots\}$$

For any other integer $n \in \mathbf{Z}$, $\bar n = n + H$ coincides with one of the above cosets. Thus, by the above theorem,
$\mathbf{Z}/H = \{\bar 0, \bar 1, \bar 2, \bar 3, \bar 4\}$ forms a group under coset addition; its addition table follows:

$$\begin{array}{c|ccccc}
+ & \bar 0 & \bar 1 & \bar 2 & \bar 3 & \bar 4 \\ \hline
\bar 0 & \bar 0 & \bar 1 & \bar 2 & \bar 3 & \bar 4 \\
\bar 1 & \bar 1 & \bar 2 & \bar 3 & \bar 4 & \bar 0 \\
\bar 2 & \bar 2 & \bar 3 & \bar 4 & \bar 0 & \bar 1 \\
\bar 3 & \bar 3 & \bar 4 & \bar 0 & \bar 1 & \bar 2 \\
\bar 4 & \bar 4 & \bar 0 & \bar 1 & \bar 2 & \bar 3
\end{array}$$

This quotient group $\mathbf{Z}/H$ is referred to as the integers modulo 5 and is frequently denoted by $\mathbf{Z}_5$. Analogously, for
any positive integer $n$, there exists the quotient group $\mathbf{Z}_n$ called the integers modulo $n$.
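
The cosets and the addition table of Example B.1 are easy to generate for any modulus. The sketch below assumes each coset $a + n\mathbf{Z}$ is represented by a finite window of its members and each element of $\mathbf{Z}_n$ by its representative in $\{0, 1, \dots, n - 1\}$; the function names `coset` and `addition_table` are illustrative.

```python
def coset(a, n, span=2):
    """A finite window of the coset a + nZ: the members a + n*k for |k| <= span."""
    return [a + n * k for k in range(-span, span + 1)]


def addition_table(n):
    """Addition table of Z_n using the representatives 0, 1, ..., n - 1."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]


# Cosets of H = 5Z in Z.
for a in range(5):
    print(a, coset(a, 5))

# Addition table of Z_5.
for row in addition_table(5):
    print(row)
```

For $n = 5$ the printed windows match the five cosets listed in the example, and the rows of the table match the coset addition table.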


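EXAMPLE B.2 The permutations of $n$ symbols (see page 269) form a group under composition of mappings; it is
called the symmetric group of degree $n$ and is denoted by $S_n$. We investigate $S_3$ here; its elements are

$$e = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \quad \phi_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$$
$$\sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad \phi_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$$

Here $\begin{pmatrix} 1 & 2 & 3 \\ i & j & k \end{pmatrix}$ is the permutation that maps $1 \mapsto i$, $2 \mapsto j$, $3 \mapsto k$. The multiplication table of $S_3$ is

$$\begin{array}{c|cccccc}
 & e & \sigma_1 & \sigma_2 & \sigma_3 & \phi_1 & \phi_2 \\ \hline
e & e & \sigma_1 & \sigma_2 & \sigma_3 & \phi_1 & \phi_2 \\
\sigma_1 & \sigma_1 & e & \phi_1 & \phi_2 & \sigma_2 & \sigma_3 \\
\sigma_2 & \sigma_2 & \phi_2 & e & \phi_1 & \sigma_3 & \sigma_1 \\
\sigma_3 & \sigma_3 & \phi_1 & \phi_2 & e & \sigma_1 & \sigma_2 \\
\phi_1 & \phi_1 & \sigma_3 & \sigma_1 & \sigma_2 & \phi_2 & e \\
\phi_2 & \phi_2 & \sigma_2 & \sigma_3 & \sigma_1 & e & \phi_1
\end{array}$$

(The element in the $a$th row and $b$th column is $ab$.) The set $H = \{e, \sigma_1\}$ is a subgroup of $S_3$; its right and left
cosets are

Right Cosets: $H = \{e, \sigma_1\}$, $H\phi_1 = \{\phi_1, \sigma_2\}$, $H\phi_2 = \{\phi_2, \sigma_3\}$

Left Cosets: $H = \{e, \sigma_1\}$, $\phi_1H = \{\phi_1, \sigma_3\}$, $\phi_2H = \{\phi_2, \sigma_2\}$

Observe that the right cosets and the left cosets are distinct; hence, $H$ is not a normal subgroup of $S_3$.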
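
The coset computation for $H$ in $S_3$ can be reproduced mechanically. The sketch below assumes a permutation is stored as the tuple of images of 1, 2, 3, and that the product $ab$ means "apply $b$ first, then $a$", a convention that is consistent with the multiplication table of this example; the labels `s1`, `s2`, `s3`, `p1`, `p2` stand for $\sigma_1, \sigma_2, \sigma_3, \phi_1, \phi_2$ and are illustrative names only.

```python
def compose(a, b):
    """Product ab: apply b first, then a (permutations as tuples of images)."""
    return tuple(a[b[i] - 1] for i in range(3))


S3 = {
    "e": (1, 2, 3),
    "s1": (1, 3, 2),
    "s2": (3, 2, 1),
    "s3": (2, 1, 3),
    "p1": (2, 3, 1),
    "p2": (3, 1, 2),
}
NAME = {perm: name for name, perm in S3.items()}


def product(x, y):
    """Name of the product xy of two named elements of S_3."""
    return NAME[compose(S3[x], S3[y])]


H = ["e", "s1"]
# Right cosets Hg and left cosets gH, collected as sets of element names.
right_cosets = {frozenset(product(h, g) for h in H) for g in S3}
left_cosets = {frozenset(product(g, h) for h in H) for g in S3}

print(sorted(sorted(c) for c in right_cosets))
print(sorted(sorted(c) for c in left_cosets))
```

The two families of cosets differ, which is the computational form of the statement that $H$ is not normal in $S_3$.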

A mapping/from a group G into a group G' is called a homomorphism iff (ah) = f(a)f(b). For every 
a, b £ G. (If/is also bijective, i.e., one-to-one and onto, then/ is called an isomorphism and G and G' are 
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said to be isomorphic.) If / : G — > G' is a homomorphism, then the kernel of/is the set of elements of G 
that map into the identity element e' £ G'\ 

kernel of/ = {a £ G |/(a) = e'} 

(As usual,/(G) is called the image of the mapping / : G —» G'.) The following theorem applies. 

THEOREM B.2: Let/: G —> G be a homomorphism with kernel K. Then K is a normal subgroup of G, 
and the quotient group G/K is isomorphic to the image of / 

EXAMPLE B.3 Let G be the group of real numbers under addition, and let G' be the group of positive real numbers
under multiplication. The mapping f : G → G' defined by f(a) = 2^a is a homomorphism because

f(a + b) = 2^{a+b} = 2^a 2^b = f(a)f(b)

In particular, f is bijective; hence, G and G' are isomorphic.

EXAMPLE B.4 Let G be the group of nonzero complex numbers under multiplication, and let G' be the group of
nonzero real numbers under multiplication. The mapping f : G → G' defined by f(z) = |z| is a homomorphism
because

f(z_1 z_2) = |z_1 z_2| = |z_1| |z_2| = f(z_1)f(z_2)

The kernel K of f consists of those complex numbers z on the unit circle, that is, for which |z| = 1. Thus, G/K is
isomorphic to the image of f, that is, to the group of positive real numbers under multiplication.


B.3 Rings, Integral Domains, and Fields 


Let R be a nonempty set with two binary operations, an operation of addition (denoted by +) and an
operation of multiplication (denoted by juxtaposition). Then R is called a ring if the following axioms are
satisfied:

[R_1] For any a, b, c ∈ R, we have (a + b) + c = a + (b + c).

[R_2] There exists an element 0 ∈ R, called the zero element, such that a + 0 = 0 + a = a for every
a ∈ R.

[R_3] For each a ∈ R there exists an element −a ∈ R, called the negative of a, such that
a + (−a) = (−a) + a = 0.

[R_4] For any a, b ∈ R, we have a + b = b + a.

[R_5] For any a, b, c ∈ R, we have (ab)c = a(bc).

[R_6] For any a, b, c ∈ R, we have

(i) a(b + c) = ab + ac, and (ii) (b + c)a = ba + ca.

Observe that the axioms [R_1] through [R_4] may be summarized by saying that R is an abelian group
under addition.

Subtraction is defined in R by a − b = a + (−b).

It can be shown (see Problem B.25) that a · 0 = 0 · a = 0 for every a ∈ R.

R is called a commutative ring if ab = ba for every a, b ∈ R. We also say that R is a ring with a unit
element if there exists a nonzero element 1 ∈ R such that a · 1 = 1 · a = a for every a ∈ R.

A nonempty subset S of R is called a subring of R if S forms a ring under the operations of R. We note
that S is a subring of R if and only if a, b ∈ S implies a − b ∈ S and ab ∈ S.

A nonempty subset I of R is called a left ideal in R if (i) a − b ∈ I whenever a, b ∈ I, and (ii) ra ∈ I
whenever r ∈ R, a ∈ I. Note that a left ideal I in R is also a subring of R. Similarly, we can define a right
ideal and a two-sided ideal. Clearly all ideals in commutative rings are two-sided. The term ideal shall
mean two-sided ideal unless otherwise specified.
THEOREM B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets {a + I | a ∈ R} form a ring
under coset addition and coset multiplication. This ring is denoted by R/I and is
called the quotient ring.

Now let R be a commutative ring with a unit element. For any a ∈ R, the set (a) = {ra | r ∈ R} is an
ideal; it is called the principal ideal generated by a. If every ideal in R is a principal ideal, then R is called a
principal ideal ring.

DEFINITION: A commutative ring R with a unit element is called an integral domain if R has no zero
divisors, that is, if ab = 0 implies a = 0 or b = 0.

DEFINITION: A commutative ring R with a unit element is called a field if every nonzero a ∈ R has a
multiplicative inverse; that is, there exists an element a^{-1} ∈ R such that aa^{-1} = a^{-1}a = 1.

A field is necessarily an integral domain; for if ab = 0 and a ≠ 0, then

b = 1 · b = a^{-1}ab = a^{-1} · 0 = 0

We remark that a field may also be viewed as a commutative ring in which the nonzero elements form a
group under multiplication.

EXAMPLE B.5 The set Z of integers with the usual operations of addition and multiplication is the classical
example of an integral domain with a unit element. Every ideal I in Z is a principal ideal; that is, I = (n) for
some integer n. The quotient ring Z_n = Z/(n) is called the ring of integers modulo n. If n is prime, then Z_n is a field.
On the other hand, if n is not prime then Z_n has zero divisors. For example, in the ring Z_6, 2 · 3 = 0 while
2 ≠ 0 and 3 ≠ 0.
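The dichotomy in Example B.5 can be explored by brute force for small n. The sketch below is illustrative only; the helper names `zero_divisors` and `is_field` are assumptions introduced here, and residues are represented by the integers 0, 1, ..., n − 1.

```python
def zero_divisors(n):
    """Pairs (a, b) of nonzero residues with a <= b and a*b = 0 in Z_n."""
    return [(a, b) for a in range(1, n) for b in range(a, n) if (a * b) % n == 0]


def is_field(n):
    """Z_n is a field when every nonzero residue has a multiplicative inverse."""
    return n > 1 and all(
        any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n)
    )


print(zero_divisors(6))          # zero divisors in Z_6, including the pair (2, 3)
print(is_field(7), is_field(6))  # prime modulus versus composite modulus
```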

EXAMPLE B.6 The rational numbers Q and the real numbers R each form a field with respect to the usual operations 
of addition and multiplication. 

EXAMPLE B.7 Let C denote the set of ordered pairs of real numbers with addition and multiplication defined by

(a, b) + (c, d) = (a + c, b + d)

(a, b) · (c, d) = (ac − bd, ad + bc)

Then C satisfies all the required properties of a field. In fact, C is just the field of complex numbers (see page 4).

EXAMPLE B.8 The set M of all 2 x 2 matrices with real entries forms a noncommutative ring with zero divisors 
under the operations of matrix addition and matrix multiplication. 

EXAMPLE B.9 Let R be any ring. Then the set R[x] of all polynomials over R forms a ring with respect to the usual
operations of addition and multiplication of polynomials. Moreover, if R is an integral domain then R[x] is also an
integral domain.

Now let D be an integral domain. We say that b divides a in D if a = bc for some c ∈ D. An element
u ∈ D is called a unit if u divides 1, that is, if it has a multiplicative inverse. An element b ∈ D is called an
associate of a ∈ D if b = ua for some unit u ∈ D. A nonunit p ∈ D is said to be irreducible if p = ab
implies a or b is a unit.

An integral domain D is called a unique factorization domain if every nonunit a ∈ D can be written
uniquely (up to associates and order) as a product of irreducible elements.

EXAMPLE B.10 The ring Z of integers is the classical example of a unique factorization domain. The units of Z
are 1 and −1. The only associates of n ∈ Z are n and −n. The irreducible elements of Z are the prime numbers.

EXAMPLE B.11 The set D = {a + b√13 | a, b integers} is an integral domain. The units of D include ±1,
18 ± 5√13, and −18 ± 5√13. The elements 2, 3 − √13, and −3 − √13 are irreducible in D. Observe that
4 = 2 · 2 = (3 − √13)(−3 − √13). Thus, D is not a unique factorization domain. (See Problem B.40.)
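The arithmetic behind Example B.11 is easy to verify with exact integers. In this sketch an element a + b√13 is stored as the integer pair (a, b), and `norm` is the function N(a + b√13) = a² − 13b² from Problem B.40; the function names themselves are illustrative assumptions.

```python
def mul(x, y):
    """Multiply a + b*sqrt(13) and c + d*sqrt(13), each stored as an integer pair."""
    a, b = x
    c, d = y
    return (a * c + 13 * b * d, a * d + b * c)


def norm(x):
    """N(a + b*sqrt(13)) = a^2 - 13*b^2."""
    a, b = x
    return a * a - 13 * b * b


two = (2, 0)
u = (3, -1)     # 3 - sqrt(13)
v = (-3, -1)    # -3 - sqrt(13)
unit = (18, 5)  # 18 + 5*sqrt(13)

print(mul(two, two), mul(u, v))       # two factorizations of 4
print(norm(unit), mul(unit, (-18, 5)))  # norm of the unit and its product with -18 + 5*sqrt(13)

# No element with small coefficients has norm +2 or -2, which is the key step
# in showing that the elements of norm +4 or -4 above cannot be factored further.
small_norm_2 = [(a, b) for a in range(-50, 51) for b in range(-50, 51)
                if norm((a, b)) in (2, -2)]
```

The finite search is only supporting evidence; a full proof that no element has norm ±2 needs an argument valid for all integers a and b.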
B.4 Modules


Let M be an additive abelian group and let R be a ring with a unit element. Then M is said to be a (left)
R-module if there exists a mapping R × M → M that satisfies the following axioms:

[M_1] r(m_1 + m_2) = rm_1 + rm_2
[M_2] (r + s)m = rm + sm
[M_3] (rs)m = r(sm)
[M_4] 1 · m = m

for any r, s ∈ R and any m, m_i ∈ M.

We emphasize that an R-module is a generalization of a vector space where we allow the scalars to
come from a ring rather than a field.

EXAMPLE B.12 Let G be any additive abelian group. We make G into a module over the ring Z of integers by
defining

$$ng = \underbrace{g + g + \cdots + g}_{n \text{ times}}, \qquad 0g = 0, \qquad (-n)g = -ng$$

where n is any positive integer.

EXAMPLE B.13 Let R be a ring and let / be an ideal in R. Then I may be viewed as a module over R. 

EXAMPLE B.14 Let V be a vector space over a field K and let T : V → V be a linear mapping. We make V into a
module over the ring K[x] of polynomials over K by defining f(x)v = f(T)(v). The reader should check that a scalar
multiplication has been defined.

Let M be a module over R. An additive subgroup N of M is called a submodule of M if u ∈ N and k ∈ R
imply ku ∈ N. (Note that N is then a module over R.)

Let M and M' be R-modules. A mapping T : M → M' is called a homomorphism (or: R-homomorphism
or R-linear) if

(i) T(u + v) = T(u) + T(v) and (ii) T(ku) = kT(u)
for every u, v ∈ M and every k ∈ R.


PROBLEMS 


Groups 

B.1. Determine whether each of the following systems forms a group G:

(i) G = set of integers, operation subtraction;

(ii) G = {1, −1}, operation multiplication;

(iii) G = set of nonzero rational numbers, operation division;

(iv) G = set of nonsingular n × n matrices, operation matrix multiplication;

(v) G = {a + bi : a, b ∈ Z}, operation addition.

B.2. Show that in a group G:

(i) the identity element of G is unique;

(ii) each a ∈ G has a unique inverse a^{-1} ∈ G;

(iii) (a^{-1})^{-1} = a, and (ab)^{-1} = b^{-1}a^{-1};

(iv) ab = ac implies b = c, and ba = ca implies b = c.
B.3. In a group G, the powers of a ∈ G are defined by

a^0 = e, a^n = aa^{n-1}, a^{-n} = (a^n)^{-1}, where n ∈ N

Show that the following formulas hold for any integers r, s, t ∈ Z: (i) a^r a^s = a^{r+s}, (ii) (a^r)^s = a^{rs},
(iii) (a^{r+s})^t = a^{rt+st}.

B.4. Show that if G is an abelian group, then (ab)^n = a^n b^n for any a, b ∈ G and any integer n ∈ Z.

B.5. Suppose G is a group such that (ab)^2 = a^2 b^2 for every a, b ∈ G. Show that G is abelian.

B.6. Suppose H is a subset of a group G. Show that H is a subgroup of G if and only if (i) H is
nonempty, and (ii) a, b ∈ H implies ab^{-1} ∈ H.

B.7. Prove that the intersection of any number of subgroups of G is also a subgroup of G.

B.8. Show that the set of all powers of a ∈ G is a subgroup of G; it is called the cyclic group generated
by a.

B.9. A group G is said to be cyclic if G is generated by some a ∈ G; that is, G = {a^n : n ∈ Z}. Show
that every subgroup of a cyclic group is cyclic.

B.10. Suppose G is a cyclic group. Show that G is isomorphic to the set Z of integers under addition
or to the set Z_n (of the integers modulo n) under addition.

B.11. Let H be a subgroup of G. Show that the right (left) cosets of H partition G into mutually disjoint
subsets.

B.12. The order of a group G, denoted by |G|, is the number of elements of G. Prove Lagrange's
theorem: If H is a subgroup of a finite group G, then |H| divides |G|.

B.13. Suppose |G| = p where p is prime. Show that G is cyclic.

B.14. Suppose H and N are subgroups of G with N normal. Show that (i) HN is a subgroup of G and
(ii) H ∩ N is a normal subgroup of H.

B.15. Let H be a subgroup of G with only two right (left) cosets. Show that H is a normal subgroup of G.

B.16. Prove Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group
G/H under coset multiplication.

B.17. Suppose G is an abelian group. Show that any factor group G/H is also abelian.

B.18. Let f : G → G' be a group homomorphism. Show that

(i) f(e) = e' where e and e' are the identity elements of G and G', respectively;

(ii) f(a^{-1}) = f(a)^{-1} for any a ∈ G.

B.19. Prove Theorem B.2: Let f : G → G' be a group homomorphism with kernel K. Then K is a normal
subgroup of G, and the quotient group G/K is isomorphic to the image of f.

B.20. Let G be the multiplicative group of complex numbers z such that |z| = 1, and let R be the additive 
group of real numbers. Prove that G is isomorphic to R/Z. 
B.21. For a fixed g ∈ G, let ĝ : G → G be defined by ĝ(a) = g^{-1}ag. Show that ĝ is an isomorphism of
G onto G.

B.22. Let G be the multiplicative group of n × n nonsingular matrices over R. Show that the mapping
A ↦ |A| is a homomorphism of G into the multiplicative group of nonzero real numbers.

B.23. Let G be an abelian group. For a fixed n ∈ Z, show that the map a ↦ a^n is a homomorphism of
G into G.

B.24. Suppose H and N are subgroups of G with N normal. Prove that H ∩ N is normal in H and
H/(H ∩ N) is isomorphic to HN/N.

Rings 

B.25. Show that in a ring R:

(i) a · 0 = 0 · a = 0, (ii) a(−b) = (−a)b = −ab, (iii) (−a)(−b) = ab.

B.26. Show that in a ring R with a unit element: (i) (−1)a = −a, (ii) (−1)(−1) = 1.

B.27. Let R be a ring. Suppose a^2 = a for every a ∈ R. Prove that R is a commutative ring. (Such a ring
is called a Boolean ring.)

B.28. Let R be a ring with a unit element. We make R into another ring R̂ by defining a ⊕ b = a + b + 1
and a · b = ab + a + b. (i) Verify that R̂ is a ring. (ii) Determine the 0-element and 1-element of R̂.

B.29. Let G be any (additive) abelian group. Define a multiplication in G by a · b = 0. Show that this
makes G into a ring.

B.30. Prove Theorem B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets {a + I | a ∈ R} form a
ring under coset addition and coset multiplication.

B.31. Let I_1 and I_2 be ideals in R. Prove that I_1 + I_2 and I_1 ∩ I_2 are also ideals in R.

B.32. Let R and R' be rings. A mapping f : R → R' is called a homomorphism (or: ring homomorphism) if
(i) f(a + b) = f(a) + f(b) and (ii) f(ab) = f(a)f(b),

for every a, b ∈ R. Prove that if f : R → R' is a homomorphism, then the set K = {r ∈ R | f(r) = 0} is an
ideal in R. (The set K is called the kernel of f.)

Integral Domains and Fields 

B.33. Prove that in an integral domain D, if ab = ac, a ≠ 0, then b = c.

B.34. Prove that F = {a + b√2 | a, b rational} is a field.

B.35. Prove that D = {a + b√2 | a, b integers} is an integral domain but not a field.

B.36. Prove that a finite integral domain D is a field.

B.37. Show that the only ideals in a field K are {0} and K.

B.38. A complex number a + bi where a, b are integers is called a Gaussian integer. Show that the set G
of Gaussian integers is an integral domain. Also show that the units in G are ±1 and ±i.

B.39. Let D be an integral domain and let I be an ideal in D. Prove that the factor ring D/I is an integral
domain if and only if I is a prime ideal. (An ideal I is prime if ab ∈ I implies a ∈ I or b ∈ I.)

B.40. Consider the integral domain D = {a + b√13 | a, b integers} (see Example B.11). If
α = a + b√13, we define N(α) = a^2 − 13b^2. Prove: (i) N(αβ) = N(α)N(β); (ii) α is a unit if
and only if N(α) = ±1; (iii) the numbers ±1, 18 ± 5√13, and −18 ± 5√13 are units of D; (iv) the numbers
2, 3 − √13, and −3 − √13 are irreducible.

Modules 

B.41. Let M be an R-module and let A and B be submodules of M. Show that A + B and A ∩ B are also
submodules of M.

B.42. Let M be an R-module with submodule N. Show that the cosets {u + N : u ∈ M} form an
R-module under coset addition and scalar multiplication defined by r(u + N) = ru + N. (This
module is denoted by M/N and is called the quotient module.)

B.43. Let M and M' be R-modules and let f : M → M' be an R-homomorphism. Show that the set
K = {u ∈ M : f(u) = 0} is a submodule of M. (The set K is called the kernel of f.)

B.44. Let M be an R-module and let E(M) denote the set of all R-homomorphisms of M into itself. Define
the appropriate operations of addition and multiplication in E(M) so that E(M) becomes a ring.




Polynomials over a Field 


C.1 Introduction 


We will investigate polynomials over a field K and show that they have many properties that are analogous 
to properties of the integers. These results play an important role in obtaining canonical forms for a linear 
operator T on a vector space V over K. 


C.2 Ring of Polynomials 


Let K be a field. Formally, a polynomial f over K is an infinite sequence of elements from K in which all
except a finite number of them are 0:

f = (..., 0, a_n, ..., a_1, a_0)

(We write the sequence so that it extends to the left instead of to the right.) The entry a_k is called the kth
coefficient of f. If n is the largest integer for which a_n ≠ 0, then we say that the degree of f is n, written

deg f = n

We also call a_n the leading coefficient of f, and if a_n = 1 we call f a monic polynomial. On the other hand,
if every coefficient of f is 0 then f is called the zero polynomial, written f = 0. The degree of the zero
polynomial is not defined.

Now if g is another polynomial over K, say

g = (..., 0, b_m, ..., b_1, b_0)

then the sum f + g is the polynomial obtained by adding corresponding coefficients. That is, if m ≤ n, then

f + g = (..., 0, a_n, ..., a_m + b_m, ..., a_1 + b_1, a_0 + b_0)

Furthermore, the product fg is the polynomial

fg = (..., 0, a_n b_m, ..., a_1 b_0 + a_0 b_1, a_0 b_0)

that is, the kth coefficient c_k of fg is

$$c_k = \sum_{i=0}^{k} a_i b_{k-i} = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0$$

The following theorem applies. 


THEOREM C.1: The set P of polynomials over a field K under the above operations of addition and
multiplication forms a commutative ring with a unit element and with no zero
divisors, that is, an integral domain. If f and g are nonzero polynomials in P, then
deg (fg) = deg f + deg g.
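The coefficient formulas for the sum and product translate directly into code. The following is a minimal sketch, assuming a polynomial is stored as a Python list of coefficients with the constant term first (the reverse of the left-extending sequence used in the text); the function names are illustrative. On the example shown, the degree of the product equals the sum of the degrees of the factors.

```python
def trim(p):
    """Drop trailing zero coefficients so the last entry is the leading coefficient."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def degree(p):
    """Degree of a nonzero polynomial; None for the zero polynomial (degree undefined)."""
    p = trim(p)
    return len(p) - 1 if p else None


def add(f, g):
    """Add corresponding coefficients, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return trim(a + b for a, b in zip(f, g))


def mul(f, g):
    """Product with k-th coefficient c_k = sum of a_i * b_(k-i)."""
    f, g = trim(f), trim(g)
    if not f or not g:
        return []
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return trim(c)


f = [1, 2, 3]      # 3t^2 + 2t + 1
g = [-1, 0, 0, 4]  # 4t^3 - 1
print(add(f, g), mul(f, g), degree(mul(f, g)))
```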
Notation

We identify the scalar a_0 ∈ K with the polynomial

a_0 = (..., 0, a_0)

We also choose a symbol, say t, to denote the polynomial

t = (..., 0, 1, 0)

We call the symbol t an indeterminate. Multiplying t with itself, we obtain

t^2 = (..., 0, 1, 0, 0), t^3 = (..., 0, 1, 0, 0, 0), ...

Thus, the above polynomial f can be written uniquely in the usual form

f = a_n t^n + ··· + a_1 t + a_0

When the symbol t is selected as the indeterminate, the ring of polynomials over K is denoted by

K[t]

and a polynomial f is frequently denoted by f(t).

We also view the field K as a subset of K[t] under the above identification. This is possible because the
operations of addition and multiplication of elements of K are preserved under this identification:

(..., 0, a_0) + (..., 0, b_0) = (..., 0, a_0 + b_0)

(..., 0, a_0) · (..., 0, b_0) = (..., 0, a_0 b_0)

We remark that the nonzero elements of K are the units of the ring K[t].

We also remark that every nonzero polynomial is an associate of a unique monic polynomial. Hence, if
d and d' are monic polynomials for which d divides d' and d' divides d, then d = d'. (A polynomial g
divides a polynomial f if there is a polynomial h such that f = hg.)


C.3 Divisibility

The following theorem formalizes the process known as "long division."

THEOREM C.2 (Division Algorithm): Let f and g be polynomials over a field K with g ≠ 0. Then
there exist polynomials q and r such that

f = qg + r

where either r = 0 or deg r < deg g.

Proof. If f = 0 or if deg f < deg g, then we have the required representation

f = 0g + f

Now suppose deg f ≥ deg g, say

f = a_n t^n + ··· + a_1 t + a_0 and g = b_m t^m + ··· + b_1 t + b_0

where a_n, b_m ≠ 0 and n ≥ m. We form the polynomial

$$f_1 = f - \frac{a_n}{b_m}\, t^{\,n-m} g \qquad (1)$$

Then deg f_1 < deg f. By induction, there exist polynomials q_1 and r such that

f_1 = q_1 g + r

where either r = 0 or deg r < deg g. Substituting this into (1) and solving for f,

$$f = \left( q_1 + \frac{a_n}{b_m}\, t^{\,n-m} \right) g + r$$

which is the desired representation.
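The inductive step of the proof is exactly one round of long division: subtract (a_n/b_m) t^{n−m} g to cancel the leading term, then repeat. The sketch below follows that idea iteratively, assuming coefficients are rational numbers (Python's `fractions.Fraction`) and polynomials are coefficient lists with the constant term first; `poly_divmod` is an illustrative name, not one used in the text.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and either r = [] or deg r < deg g.

    Polynomials are coefficient lists, lowest degree first, over the rationals.
    """
    r = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while g and g[-1] == 0:
        g.pop()
    if not g:
        raise ZeroDivisionError("g must be a nonzero polynomial")
    while r and r[-1] == 0:
        r.pop()
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 1)
    while len(r) >= len(g):
        shift = len(r) - len(g)    # n - m
        coeff = r[-1] / g[-1]      # a_n / b_m
        q[shift] += coeff
        for i, b in enumerate(g):  # r <- r - (a_n/b_m) t^(n-m) g
            r[i + shift] -= coeff * b
        while r and r[-1] == 0:    # the leading term has been cancelled
            r.pop()
    while q and q[-1] == 0:
        q.pop()
    return q, r


# 2t^3 + 3t^2 + 1 divided by t^2 + 1
q, r = poly_divmod([1, 0, 3, 2], [1, 0, 1])
print(q, r)
```

Exact rational arithmetic is used so that the leading coefficient cancels exactly at each step; with floating-point coefficients the termination test would need a tolerance.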

THEOREM C.3: The ring K[t] of polynomials over a field K is a principal ideal ring. If I is an ideal in
K[t], then there exists a unique monic polynomial d that generates I, such that d
divides every polynomial f ∈ I.

Proof. Let d be a polynomial of lowest degree in I. Because we can multiply d by a nonzero scalar and
still remain in I, we can assume without loss of generality that d is a monic polynomial. Now suppose
f ∈ I. By Theorem C.2 there exist polynomials q and r such that

f = qd + r where either r = 0 or deg r < deg d

Now f, d ∈ I implies qd ∈ I, and hence, r = f − qd ∈ I. But d is a polynomial of lowest degree in I.
Accordingly, r = 0 and f = qd; that is, d divides f. It remains to show that d is unique. If d' is another
monic polynomial that generates I, then d divides d' and d' divides d. This implies that d = d', because d
and d' are monic. Thus, the theorem is proved.

THEOREM C.4: Let f and g be nonzero polynomials in K[t]. Then there exists a unique monic
polynomial d such that

(i) d divides f and g; and (ii) if d' divides f and g, then d' divides d.

DEFINITION: The above polynomial d is called the greatest common divisor of f and g. If d = 1, then
f and g are said to be relatively prime.

Proof of Theorem C.4. The set I = {mf + ng | m, n ∈ K[t]} is an ideal. Let d be the monic polynomial
that generates I. Note f, g ∈ I; hence, d divides f and g. Now suppose d' divides f and g. Let J be the ideal
generated by d'. Then f, g ∈ J, and hence, I ⊆ J. Accordingly, d ∈ J and so d' divides d as claimed. It
remains to show that d is unique. If d_1 is another (monic) greatest common divisor of f and g, then d
divides d_1 and d_1 divides d. This implies that d = d_1 because d and d_1 are monic. Thus, the theorem is
proved.

COROLLARY C.5: Let d be the greatest common divisor of the polynomials f and g. Then there exist
polynomials m and n such that d = mf + ng. In particular, if f and g are relatively
prime, then there exist polynomials m and n such that mf + ng = 1.

The corollary follows directly from the fact that d generates the ideal

I = {mf + ng | m, n ∈ K[t]}
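The proof of Theorem C.4 is an existence argument, but d, m, and n can also be computed. The sketch below uses the extended Euclidean algorithm, which is a swapped-in computational technique and not the argument given in the text. It assumes rational coefficients and coefficient lists with the constant term first; all function names are illustrative.

```python
from fractions import Fraction


def trim(p):
    """Convert to Fractions and drop trailing zeros (the leading end of the list)."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def add(f, g):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])


def scale(f, k):
    return trim([k * c for c in f])


def mul(f, g):
    if not f or not g:
        return []
    c = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return trim(c)


def divmod_poly(f, g):
    """Long division: return (q, r) with f = q*g + r; g must be nonzero."""
    q, r, g = [], trim(f), trim(g)
    if not g:
        raise ZeroDivisionError("g must be a nonzero polynomial")
    while len(r) >= len(g):
        term = [Fraction(0)] * (len(r) - len(g)) + [r[-1] / g[-1]]
        q = add(q, term)
        r = add(r, scale(mul(term, g), -1))
    return q, r


def monic_gcd(f, g):
    """Return (d, m, n) with d the monic gcd of f and g, and d = m*f + n*g.

    Assumes f and g are not both zero.
    """
    r0, r1 = trim(f), trim(g)
    m0, m1 = [Fraction(1)], []   # invariant: r0 = m0*f + n0*g
    n0, n1 = [], [Fraction(1)]   # invariant: r1 = m1*f + n1*g
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        m0, m1 = m1, add(m0, scale(mul(q, m1), -1))
        n0, n1 = n1, add(n0, scale(mul(q, n1), -1))
    lead = r0[-1]
    return scale(r0, 1 / lead), scale(m0, 1 / lead), scale(n0, 1 / lead)


f = [-1, 0, 1]   # t^2 - 1 = (t - 1)(t + 1)
g = [2, -3, 1]   # t^2 - 3t + 2 = (t - 1)(t - 2)
d, m, n = monic_gcd(f, g)
print(d, m, n)
```

The pair (m, n) is not unique; the algorithm returns one valid choice.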


C.4 Factorization 


A polynomial p ∈ K[t] of positive degree is said to be irreducible if p = fg implies f or g is a scalar.

LEMMA C.6: Suppose p ∈ K[t] is irreducible. If p divides the product fg of polynomials f, g ∈ K[t],
then p divides f or p divides g. More generally, if p divides the product of n polynomials
f_1 f_2 ··· f_n, then p divides one of them.


Proof. Suppose p divides fg but not f. Because p is irreducible, the polynomials f and p must then be
relatively prime. Thus, there exist polynomials m, n ∈ K[t] such that mf + np = 1. Multiplying this
equation by g, we obtain mfg + npg = g. But p divides fg and so mfg, and p divides npg; hence, p divides
the sum g = mfg + npg.

Now suppose p divides f_1 f_2 ··· f_n. If p divides f_1, then we are through. If not, then by the above result p
divides the product f_2 ··· f_n. By induction on n, p divides one of the polynomials f_2, ..., f_n. Thus, the
lemma is proved.

THEOREM C.7: (Unique Factorization Theorem) Let f be a nonzero polynomial in K[t]. Then f can be
written uniquely (except for order) as a product

f = k p_1 p_2 ··· p_n

where k ∈ K and the p_i are monic irreducible polynomials in K[t].

Proof. We prove the existence of such a product first. If f is irreducible or if f ∈ K, then such a product
clearly exists. On the other hand, suppose f = gh where g and h are nonscalars. Then g and h have degrees
less than that of f. By induction, we can assume

g = k_1 g_1 g_2 ··· g_r and h = k_2 h_1 h_2 ··· h_s

where k_1, k_2 ∈ K and the g_i and h_j are monic irreducible polynomials. Accordingly,

f = (k_1 k_2) g_1 g_2 ··· g_r h_1 h_2 ··· h_s

is our desired representation.

We next prove uniqueness (except for order) of such a product for f. Suppose

f = k p_1 p_2 ··· p_n = k' q_1 q_2 ··· q_m

where k, k' ∈ K and the p_1, ..., p_n, q_1, ..., q_m are monic irreducible polynomials. Now p_1 divides
k' q_1 ··· q_m. Because p_1 is irreducible, it must divide one of the q_i by the above lemma. Say p_1 divides q_1.
Because p_1 and q_1 are both irreducible and monic, p_1 = q_1. Accordingly,

k p_2 ··· p_n = k' q_2 ··· q_m

By induction, we have that n = m and p_2 = q_2, ..., p_n = q_m for some rearrangement of the q_i. We also
have that k = k'. Thus, the theorem is proved.

If the field K is the complex field C, then we have the following result that is known as the fundamental 
theorem of algebra; its proof lies beyond the scope of this text. 


THEOREM C.8: (Fundamental Theorem of Algebra) Let f(t) be a nonzero polynomial over the
complex field C. Then f(t) can be written uniquely (except for order) as a product

f(t) = k(t − r_1)(t − r_2) ··· (t − r_n)

where k, r_i ∈ C; that is, as a product of linear polynomials.

In the case of the real field R we have the following result.

THEOREM C.9: Let f(t) be a nonzero polynomial over the real field R. Then f(t) can be written
uniquely (except for order) as a product

f(t) = k p_1(t) p_2(t) ··· p_m(t)

where k ∈ R and the p_i(t) are monic irreducible polynomials of degree one or two.
D.1 Introduction


This appendix discusses various topics, such as equivalence relations and singular value decomposition. 

D.2 Relations and Equivalence Relations 

A binary relation or simply relation R from a set A to a set B assigns to each ordered pair (a, b) ∈ A × B
exactly one of the following statements:

(i) "a is related to b," written a R b, (ii) "a is not related to b," written a R̸ b.

A relation from a set A to the same set A is called a relation on A.

Observe that any relation R from A to B uniquely defines a subset R of A × B as follows:

R = {(a, b) | a R b}

Conversely, any subset R of A × B defines a relation from A to B as follows:

a R b if and only if (a, b) ∈ R

In view of the above correspondence between relations from A to B and subsets of A × B, we redefine a
relation from A to B as follows:

DEFINITION D.1: A relation R from A to B is a subset of A × B.

Equivalence Relations 

Consider a nonempty set S. A relation R on S is called an equivalence relation if R is reflexive, symmetric,
and transitive; that is, if R satisfies the following three axioms:

[E_1] (Reflexivity) Every a ∈ S is related to itself. That is, for every a ∈ S, a R a.

[E_2] (Symmetry) If a is related to b, then b is related to a. That is, if a R b, then b R a.

[E_3] (Transitivity) If a is related to b and b is related to c, then a is related to c. That is,
if a R b and b R c, then a R c.

The general idea behind an equivalence relation is that it is a classification of objects that are in some way
"alike." Clearly, the relation of equality is an equivalence relation. For this reason, one frequently uses ∼
or ≡ to denote an equivalence relation.

EXAMPLE D.1

(a) In Euclidean geometry, similarity of triangles is an equivalence relation. Specifically, suppose α, β, γ
are triangles. Then (i) α is similar to itself. (ii) If α is similar to β, then β is similar to α. (iii) If α is
similar to β and β is similar to γ, then α is similar to γ.

(b) The relation ⊆ of set inclusion is not an equivalence relation. It is reflexive and transitive, but it is not
symmetric because A ⊆ B does not imply B ⊆ A.
Equivalence Relations and Partitions

Let S be a nonempty set. Recall first that a partition P of S is a subdivision of S into nonempty,
nonoverlapping subsets; that is, a collection P = {A_i} of nonempty subsets of S such that (i) each a ∈ S
belongs to one of the A_i, and (ii) the sets {A_i} are mutually disjoint.

The subsets in a partition P are called cells. Thus, each a ∈ S belongs to exactly one of the cells. Also,
any element b ∈ A_i is called a representative of the cell A_i, and a subset B of S is called a system of
representatives if B contains exactly one element in each of the cells in {A_i}.

Now suppose R is an equivalence relation on the nonempty set S. For each a ∈ S, the equivalence class
of a, denoted by [a], is the set of elements of S to which a is related:

[a] = {x | a R x}.

The collection of equivalence classes, denoted by S/R, is called the quotient of S by R:

S/R = {[a] | a ∈ S}

The fundamental property of an equivalence relation and its quotient set is contained in the following
theorem:

THEOREM D.1: Let R be an equivalence relation on a nonempty set S. Then the quotient set S/R is a
partition of S.

EXAMPLE D.2 Let ≡ be the relation on the set Z of integers defined by x ≡ y (mod 5), which reads "x is
congruent to y modulo 5" and which means that the difference x − y is divisible by 5. Then ≡ is an
equivalence relation on Z.

There are exactly five equivalence classes in the quotient set Z/≡, as follows:

A_0 = {..., −10, −5, 0, 5, 10, ...}
A_1 = {..., −9, −4, 1, 6, 11, ...}
A_2 = {..., −8, −3, 2, 7, 12, ...}
A_3 = {..., −7, −2, 3, 8, 13, ...}
A_4 = {..., −6, −1, 4, 9, 14, ...}

Note that any integer x, which can be expressed uniquely in the form x = 5q + r where 0 ≤ r < 5, is a
member of the equivalence class A_r where r is the remainder. As expected, the equivalence classes are
disjoint and their union is Z:

Z = A_0 ∪ A_1 ∪ A_2 ∪ A_3 ∪ A_4

This quotient set Z/≡, called the integers modulo 5, is denoted
Z/5Z or simply Z_5.

Usually one chooses {0, 1, 2, 3, 4} or {−2, −1, 0, 1, 2} as a system of representatives of the equivalence
classes.

Analogously, for any positive integer m, there exists the congruence relation ≡ defined by

x ≡ y (mod m)

and the quotient set Z/≡, denoted by Z/mZ or simply Z_m, is called the integers modulo m.
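The partition of Z into congruence classes can be illustrated on a finite window of integers. This is a small sketch, with `residue_classes` an illustrative helper name; it relies on the fact that Python's `%` operator returns a remainder r with 0 ≤ r < m for positive m, including for negative x.

```python
def residue_classes(m, lo, hi):
    """Partition the integers lo..hi (inclusive) into congruence classes mod m."""
    classes = {r: [] for r in range(m)}
    for x in range(lo, hi + 1):
        classes[x % m].append(x)  # remainder r satisfies 0 <= r < m
    return classes


classes = residue_classes(5, -10, 14)
for r, members in classes.items():
    print(f"A_{r}:", members)

# The cells are disjoint and together cover the whole window.
covered = sorted(x for members in classes.values() for x in members)
```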


D.3 Properties of AA^T and A^T A

Let A be any m × n matrix. Then A^T A and AA^T are both symmetric since

(A^T A)^T = A^T A^{TT} = A^T A and (AA^T)^T = A^{TT} A^T = AA^T

One can also show that A^T A and AA^T have the same nonzero eigenvalues.

Recall [Theorem 9.14] that a symmetric matrix M is orthogonally diagonalizable; that is, there exists an
orthogonal matrix P and a diagonal matrix D = diag(d_i) such that

P^{-1}MP = D

where the columns of P are the eigenvectors of M and the d_i are the eigenvalues of M.

In particular, the symmetric matrix M is called:

(i) positive definite if, for every nonzero u, <u, Mu> = u^T Mu > 0;

(ii) positive semidefinite if, for every nonzero u, <u, Mu> = u^T Mu ≥ 0.

In such a case, the diagonal elements in P^{-1}MP = D are: (i) positive, (ii) nonnegative.

A^T A is positive definite if A has full column rank, since Au ≠ 0 when u ≠ 0; so

<u, A^T Au> = u^T A^T Au = <Au, Au> > 0

Similarly, AA^T is positive definite if A has full row rank, since A^T u ≠ 0 when u ≠ 0; so

<u, AA^T u> = u^T AA^T u = u^T A^{TT} A^T u = <A^T u, A^T u> > 0

On the other hand, A^T A and AA^T are always both positive semidefinite.

D.4 Singular Value Decomposition 


Let A be any m × n matrix of rank r. Then there exists a factorization of A of the form

A = UΣV^T

where U is an m-square orthogonal matrix, V is an n-square orthogonal matrix, and Σ = [σ_ij] is an m × n
generalized diagonal matrix, that is, σ_ij = 0 for i ≠ j. Such a factorization is called a singular value
decomposition (SVD) of A, and the diagonal entries σ_i = σ_ii of Σ, usually listed in descending order, are
called the singular values of A.


THEOREM D.2: Every matrix A with rank r > 0 has a singular value decomposition.

We indicate how the matrices U, V, and Σ in the SVD of A are obtained.

Recall AA^T is symmetric and positive definite (or positive semidefinite). Accordingly, AA^T is
orthogonally diagonalizable, that is, AA^T = PDP^{-1} = PDP^T where the columns of the orthogonal
matrix P are eigenvectors of AA^T, and the entries of D are the eigenvalues of AA^T and they are
nonnegative.

Assuming A = UΣV^T, we have

AA^T = UΣV^T (UΣV^T)^T = UΣV^T VΣ^T U^T = U(ΣΣ^T)U^T

where ΣΣ^T is a diagonal matrix whose diagonal entries are the σ_i^2 (and possibly zeros). Thus the columns
of U in the SVD of A are the normalized eigenvectors of AA^T and the entries σ_i in Σ are
the square roots of the eigenvalues of AA^T.

Similarly, assuming A = UΣV^T, we have

A^T A = (UΣV^T)^T UΣV^T = VΣ^T U^T UΣV^T = V(Σ^T Σ)V^T

Thus the columns of V and the rows of V^T are the normalized eigenvectors of A^T A, and the entries σ_i in Σ
are the square roots of the eigenvalues of A^T A (whose nonzero eigenvalues are the same as those of AA^T).


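EXAMPLE D.3 We find the singular value decomposition (SVD) of

$$A = \begin{bmatrix} 4 & 4 \\ -3 & 3 \end{bmatrix}$$

Note

$$AA^T = \begin{bmatrix} 4 & 4 \\ -3 & 3 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ 4 & 3 \end{bmatrix} = \begin{bmatrix} 32 & 0 \\ 0 & 18 \end{bmatrix}$$

The eigenvalues are 32 and 18 with corresponding eigenvectors [1, 0]^T and [0, 1]^T. Thus

$$U = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad \Sigma = \begin{bmatrix} \sqrt{32} & 0 \\ 0 & \sqrt{18} \end{bmatrix} = \sqrt{2} \begin{bmatrix} 4 & 0 \\ 0 & 3 \end{bmatrix}$$

Also

$$A^T A = \begin{bmatrix} 4 & -3 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} 4 & 4 \\ -3 & 3 \end{bmatrix} = \begin{bmatrix} 25 & 7 \\ 7 & 25 \end{bmatrix}$$

Hence Δ(t) = t^2 − 50t + 576 = (t − 32)(t − 18). Thus, again, the eigenvalues are 32 and 18.

The normalized corresponding eigenvectors are [1/√2, 1/√2]^T and [−1/√2, 1/√2]^T. Thus

$$V^T = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$$

As expected,

$$U \Sigma V^T = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \sqrt{2} \begin{bmatrix} 4 & 0 \\ 0 & 3 \end{bmatrix} \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 4 \\ -3 & 3 \end{bmatrix} = A$$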
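The computations in Example D.3 can be confirmed numerically. The following sketch uses only the Python standard library, with matrices stored as nested lists; the helper names `matmul` and `transpose` are illustrative, and the factors U, Σ, and V^T are the ones obtained in the example.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


A = [[4, 4], [-3, 3]]
AAt = matmul(A, transpose(A))  # symmetric, eigenvalues 32 and 18
AtA = matmul(transpose(A), A)  # symmetric, same eigenvalues

U = [[1, 0], [0, 1]]
S = [[math.sqrt(32), 0], [0, math.sqrt(18)]]  # singular values on the diagonal
s = 1 / math.sqrt(2)
Vt = [[s, s], [-s, s]]                        # rows are eigenvectors of A^T A

recon = matmul(matmul(U, S), Vt)              # should reproduce A up to rounding
print(AAt, AtA)
print(recon)
```

Because square roots are computed in floating point, the reconstructed matrix should be compared with A using a tolerance and not exact equality.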




A 

Absolute value (complex), 12 
Abelian group, 405 
Adjoint, classical, 273 
operator, 379, 386 
Algebraic multiplicity, 300 
Alternating mappings, 278, 362, 401 
Angle between vectors, 6, 232 
Annihilator, 332, 353, 356 
Associate, 408 

Associated homogeneous system, 83 
Associative, 176, 405 
Augmented matrix, 59 

B 

Back-substitution, 63, 65, 67 
Basis, 82, 124, 142 
change of, 201, 213 
dual, 352, 354 
orthogonal, 245 
orthonormal, 245 
second dual, 369 
standard, 125 
usual, 125 

Basis-finding algorithm, 127 
Bessel inequality, 254 
Bijective mapping, 168 
Bilinear form, 361, 398 
alternating, 278 
matrix representation of, 362 
polar form of, 365 
real symmetric, 365 
symmetric, 363 
Bilinear mapping, 361, 398 
Block matrix, 39, 50 
determinants, 419 
Jordan, 345 
square, 40 
Bounded, 158 

C 

Cancellation law, 113 
Canonical forms, 207, 327 
Jordan, 331, 338 
rational, 333 
row, 74 
triangular, 327 
Casting-out algorithm, 128 
Cauchy-Schwarz inequality, 5, 231, 242 


Cayley-Hamilton theorem, 296, 310 
Cells, 39, 417 
Change of basis, 201, 213 
Change-of-basis (transition) matrix, 201 
Change-of-coordinate matrix, 223 
Characteristic polynomial, 296, 307 
value, 298 

Classical adjoint, 273 
Coefficient, 57, 58, 413 
Fourier, 235, 246 
matrix, 59 
Cofactor, 271 
Column, 27 
matrix, 27 
operations, 89 
space, 120 
vector, 3 

Colsp(A), column space, 126 
Commutative law, 405 
group, 113 

Commuting (diagram), 398 
Companion matrix, 306 
Complement, orthogonal, 244 
Complementary minor, 275 
Completing the square, 395 
Complex: 
conjugate, 13 
inner product, 241 
matrix, 38, 49 
n-space, 13 
numbers, 1, 11, 13 
plane, 12 
Complexity, 88 
Components, 2 

Composition of mappings, 167 
Congruent matrices, 362 
diagonalization, 61 
Conjugate: 
complex, 12 
linearity, 241 
matrix, 38 
symmetric, 241 
Consistent system, 59 
Constant term, 57, 58 
Convex set, 195 
Coordinates, 2, 130 
vector, 130 
Coset, 184, 334, 405 
Cramer’s rule, 274 
Index 


Cross product, 10 
Curves, 8 

Cyclic subspaces, 332, 344 
group, 410 

D 

$\delta_{ij}$, Kronecker delta function, 33 

Decomposable, 329 
Decomposition: 
direct-sum, 129 
primary, 240 
Degenerate, 362 
bilinear form, 362 
linear equations, 59 
Dependence, linear, 133 
Derivative, 170 
Determinant, 63, 265, 269 
computation of, 66, 272 
linear operator, 277 
order, 3, 268 
Diagonal, 32 
blocks, 40 
matrix, 35, 47 
quadratic form, 304 
Diagonal (of a matrix), 10 
Diagonalizable, 205, 294, 298 
Diagonalization: 
algorithm, 301 
in inner product space, 384 
Dimension of solution spaces, 82 
Dimension of vector spaces, 82, 141 
finite, 124 
infinite, 124 
subspaces, 126 
Direct sum, 129, 329 
decomposition, 329 
Directed line segment, 7 
Distance, 5, 243 
Divides, 414 
Division algorithm, 414 
Domain, 166, 408 
Dot product, 4 
Dual: 

basis, 352, 354 
space, 351, 354 

E 

Echelon: 
form, 65, 72 
matrices, 70 
Eigenline, 298 
Eigenspace, 301 
Eigenvalue, 298, 300, 314 
Eigenvector, 298, 300, 314 
Elementary divisors, 333 
Elementary matrix, 84 
Elementary operations, 61 
column, 86 
row, 72, 120 


Elimination, Gaussian, 67 
Empty set, ∅, 112 
Equal: 

functions, 166 
matrices, 27 
vectors, 2 

Equations (See Linear equations) 
Equivalence: 
classes, 418 
matrix, 87 
relation, 73, 417 
row, 72 

Equivalent systems, 61 
Euclidean n-space, 5, 230 
Exterior product, 403 

F 

Field of scalars, 11, 408 
Finite dimension, 124 
Form: 

bilinear, 361 
linear, 351 
quadratic, 365 

Forward elimination, 63, 67, 73 
Fourier coefficient, 81, 235 
series, 235 
Free variable, 65, 66 
Full rank, 41 
factorization, 419 
Function, 156 
space F(X), 114 
Functional, linear, 351 
Fundamental Theorem of Algebra, 416 

G 

Gaussian elimination, 61, 67, 73 
Gaussian integer, 411 
Gauss-Jordan algorithm, 74 
General solution, 58 
Geometric multiplicity, 300 
Gram-Schmidt orthogonalization, 237 
Graph, 166 

Greatest common divisor, 415 
Group, 113, 405 

H 

Hermitian: 
form, 366 
matrix, 38, 49 
quadratic form, 366 
Hilbert space, 331 
Homogeneous system, 58, 81 
Homomorphism, 173, 406, 409 
Hom(V, U), 175 
Hyperplane, 7, 360 

I 

i, imaginary, 12 
Ideal, 407 
Identity: 

mapping, 168, 170 
matrix, 33 
ijk notation, 9 
Image, 166, 171, 172 
Imaginary part, 12 
Im F, image, 171 
Im z, imaginary part, 12 
Inclusion mapping, 192 
Inconsistent systems, 59 
Independence, linear, 133 
Index, 30 

Index of nilpotency, 330 
Inertia, Law of, 366 
Infinite dimension, 124 
Infinity-norm, 243 
Injective mapping, 168 
Inner product, 4 
complex, 241 
Inner product spaces, 228 
linear operators on, 379 
Integral, 170 
domain, 408 
Invariance, 226 

Invariant subspaces, 226, 328, 334 
direct-sum, 329 
Inverse image, 166 
Inverse mapping, 166 
Inverse matrix, 34, 46, 85 
computing, 85 
inversion, 269 
Invertible: 

matrices, 34, 46 
Irreducible, 408 
Isometry, 383 

Isomorphic vector spaces, 171, 406 

J 

Jordan: 
block, 306 

canonical form, 331, 338 


K 

Ker F, kernel, 171 

Kernel, 171, 172 

Kronecker delta function $\delta_{ij}$, 33 

L 

$\ell_2$-space, 231 
Laplace expansion, 272 
Law of inertia, 365 
Leading: 
coefficient, 60 
nonzero element, 70 
unknown, 60 

Least square solution, 135 
Legendre polynomial, 239 
Length, 5, 229 
Limits (summation), 30 
Line, 8, 194 


Linear: 

combination, 3, 29, 60, 79, 115 

dependence, 121 

form, 351 

functional, 351 

independence, 121 

span, 119 

Linear equation, 57 
Linear equations (system), 58 
consistent, 59 
echelon form, 65 
triangular form, 64 
Linear mapping (function), 166, 169 
image, 166, 171 
kernel, 171 
nullity, 173 
rank, 173 
Linear operator: 
adjoint, 379 

characteristic polynomial, 306 
determinant, 277 
on inner product spaces, 379 
invertible, 177 
matrix representation, 197 
Linear transformation (See linear mappings), 169 
Located vectors, 7 
LU decomposition, 87, 106 

M 

$M_{m,n}$, matrix vector space, 114 
Mappings (maps), 166 
bilinear, 361, 398 
composition of, 167 
linear, 169 
matrix, 170 
Matrices: 
congruent, 362 
equivalent, 87 
similar, 205 
Matrix, 27 
augmented, 59 
change-of-basis, 201 
coefficient, 59 
companion, 306 
diagonal, 35 
echelon, 65, 70 
elementary, 84 
equivalence, 87 
Hermitian, 38, 49 
identity, 33 
invertible, 34 
nonsingular, 34 
normal, 38 
orthogonal, 239 
positive definite, 240 
rank, 72, 87 
space, $M_{m,n}$, 114 
square root, 298 
triangular, 36 
Matrix mapping, 167 
Matrix multiplication, 30 
Matrix representation, 197, 240, 362 
adjoint operator, 379, 386 
bilinear form, 361 
change of basis, 201 
linear mapping, 197 
Metric space, 243 
Minimal polynomial, 305, 307 
Minkowski’s inequality, 5 
Minor, 271, 275 
principal, 275 
Module, 409 

Monic polynomial, 305, 413 
Moore-Penrose inverse, 134 
Multilinearity, 278, 401 
Multiplicity, 300 
Multiplier, 67, 73, 87 

N 

n-linear, 278 
n-space, 2 
complex, 13 
real, 2 

Natural mapping, 353 
New basis, 201 
Nilpotent, 330, 338 
Nonnegative semidefinite, 228 
Nonsingular, 112 
linear maps, 174 
matrices, 34 
Norm, 5, 229, 243 
Normal, 7 
matrix, 38 
operator, 382, 385 
Normalized, 229 
Normalizing, 5, 229, 235 
Normed vector space, 243 
Nullity, 173 
nullsp(A), 172 
Null space, 172 

O 

Old basis, 201 
One-norm, 243 
One-to-one: 
correspondence, 168 
mapping, 168 

Onto mapping (function), 168 
Operators (See Linear operators) 
Order, n: 
determinant, 266 
of a group, 410 
Orthogonal, 4, 37, 80 
basis, 233 
complement, 233 
matrix, 239 
operator, 382 
projection, 386 
substitution, 304 


Orthogonalization, Gram-Schmidt, 237 
Orthogonally equivalent, 382 
Orthonormal, 235 
Outer product, 10 

P 

Parameter, 64 
form, 65 

Particular solution, 58 
Partition, 418 
Permutations, 8, 269 
Perpendicular, 4 
Pivot, 67, 71 
row reduction, 94 
variables, 65 

Pivoting (row reduction), 94 
Polar form, 365 
Polynomial, 413 
characteristic, 296, 307 
minimum, 305 
space, $P_n(t)$, 114 
Positive definite, 228 
matrices, 240, 366 
operators, 338, 384 
Positive operators, 228 
square root, 393 

Primary decomposition theorem, 330 

Prime ideal, 412 
Principal ideal ring, 408 
Principal minor, 275 
Product: 
exterior, 403 
inner, 4 
tensor, 398 

Projections, 169, 236, 346, 386 
orthogonal, 386 
Proper value, 298 
vector, 298 

Pythagorean theorem, 235 

Q 

Q, rational numbers, 11 
Quadratic form, 303, 316, 365 
Quotient 

group, 405 
ring, 408 
spaces, 334, 418 

R 

R, real numbers, 1, 12 
$R^n$, real n-space, 2 
Range, 166, 171 

Rank, 72, 87, 128, 173, 366 
Rational: 

canonical form, 333 
numbers, Q, 11 
Real: 

numbers, R, 1 

part (complex number), 12 
Real symmetric bilinear form, 365 
Reduce, 73 
Relation, 417 
Representatives, 418 
Restriction mapping, 194 
Right-handed system, 11 
Right inverse, 191 
Ring, 407 
quotient, 408 
Root, 295 
Rotation, 171 
Row, 27 

canonical form, 72 
equivalence, 72 
operations, 72 
rank, 72 
reduce, 73 

reduced echelon form, 73 
space, 120 

S 

$S_n$, symmetric group, 269, 406 
Scalar, 1, 12 
matrix, 33 
multiplication, 33 
product, 27 
Scaling factor, 298 
Schwarz inequality, 5, 231, 242 
(See Cauchy-Schwarz inequality) 
Second dual space, 352 
Self-adjoint operator, 382 
Sign of permutation, 269 
Signature, 366 
Similar, 205, 226 
Similarity transformation, 205 
Singular, 173 
Size (matrix), 27 
Skew-adjoint operator, 382 
Skew-Hermitian, 38 
Skew-symmetric, 362 
matrix, 36, 48 

Solution, (linear equations), 57 
zero, 121 
Spatial vectors, 9 
Span, 115 
Spanning sets, 115 
Spectral theorem, 385 
Square: 
matrix, 32, 44 

system of linear equations, 58, 72 
Square root of a matrix, 393 
Standard: 
basis, 125 
form, 57, 401 
inner product, 230 
Subdiagonal, 306 
Subgroup, 405 
Subset, 112 
Subspace, 117, 133 
Sum of vector spaces, 131 


Summation symbol, 29 
Superdiagonal, 306 
Surjective map, 168 
Sylvester’s theorem, 366 
Symmetric: 
bilinear form, 363 
matrices, 4, 36 

Systems of linear equations, 58 

T 

Tangent vector, T(t), 9 
Target set, 166 
Tensor product, 398 
Time complexity, 88 
Top-down, 73 
Trace, 33 

Transformation (linear), 169 
Transition matrix, 201 
Transpose: 

linear functional (dual space), 353 
matrix, 32 

Triangle inequality, 232 
Triangular form, 64 
Triangular matrix, 36, 47 
block, 40 
Triple product, 11 
Two-norm, 243 

U 

Unique factorization domain, 408, 416 
Unit vector, 5, 229 
matrix, 33 

Unitary, 38, 49, 382 

Universal mapping principle (UMP), 398 
Usual: 
basis, 125 
inner product, 230 

V 

Vandermonde determinant, 292 
Variable, free, 65 
Vector, 2 
coordinates, 130 
located, 7 
product, 10 
spatial, 9 

Vector space, 112, 228 
basis, 124 
dimension, 124 
Volume, 276 

W 

Wedge (exterior) product, 403 

Z 

Z, integers, 408 
Zero: 

mapping, 128, 170, 175 
matrix, 27 
polynomial, 413 
solution, 121 
vector, 2 



